Jim Liddle at Nasuni explains how firms can cope with the data mountain and make AI a more practical proposition
A car insurer finds that claimants routinely add dashcam footage and accident diagrams to claims, but its IT function is struggling to keep up with this ever-growing volume of unstructured data. This means its claims teams can’t assess all the details in a way that accelerates response times.
Meanwhile, a hospital is unable to confirm a diagnosis of a rare condition suggested by earlier tests because its multi-disciplinary team doesn’t have the ability to read the relevant MRI scans and X-rays as a consolidated process across multiple locations and IT environments.
IT teams struggling with ever-growing unstructured data loads are by no means alone: research by analysts IDC found that 27% of organisations cannot keep pace with the different types of unstructured data (WhatsApp messages, charts embedded in Word docs, video clips, etc.) as they become part of everyday life and are increasingly fed into IT operations. IDC’s research also found that only 58% of organisations’ unstructured data is ever reused.
Intelligence and responsiveness
The arrival of multi-modal AI, however, points us towards a new era of data intelligence and responsiveness. Defined by the integration of natural language, vision, and multi-sensory processing into emerging AI systems, this shift promises to redefine how these tools understand, interact with, and navigate the world around them. In doing so, it can transform organisations’ and individuals’ ability to analyse data and respond to customers.
While single-mode AI excels at tasks related to one data type, multi-modal AI enables more comprehensive understanding and interactions by leveraging cross-modal information. These capabilities allow for more adaptive and human-like AI behaviours, unlocking new possibilities for applications that require understanding across different modalities and information sources.
New possibilities, new efficiencies
GenAI hype has missed the important point that multi-modal AI capabilities can enable organisations to successfully bring unstructured data into AI strategies and make their complex processes and multi-stakeholder operations more effective. Analysts have forecast that the multi-modal AI market will grow at more than 30% per year between now and 2030, as companies identify wider applications for it.
There are exciting potential applications for these multi-modal AI tools. Teams can access information previously trapped within documents in siloed structured and unstructured formats (e.g. legal firms simplifying time-hungry tasks such as discovery). This can reveal previously hidden data insights that will enable further streamlining of business operations - such as automating data processing and reducing the workforce resources required for manual data handling (e.g. manufacturers publishing technical manuals with embedded diagrams to make equipment maintenance training more focused and less time-consuming).
Multi-modal AI tools can enhance corporate data strategies because they enable organisations to integrate wider and more compelling data sources into processes in unified ways (e.g. hospitals being able to interpret patient records including ad hoc notes, test results and medical images). These capabilities also help IT and lines of business to simplify compliance and data governance tasks - for example, banks using deeper analysis to accelerate loan approvals and reduce fraudulent transactions.
Multi-modal AI is unleashing the possibilities of contextual data for innovation. Organisations can develop new services by uncovering wider insights from unstructured data, such as retailers pursuing truly individualised customer marketing strategies. In construction, multi-modal AI is already helping to analyse building information modelling (BIM) models, satellite images and sensor data, enhancing site selection, design and construction processes to deliver more efficient and sustainable projects.
Safe development
However, multi-modal AI also brings greater complexity than single-modal systems in model development, data integration, and ethical considerations. CIOs will need to follow some crucial steps to derisk multi-modal AI implementations.
First, IT teams will need to assess their data infrastructures, ensuring that unstructured data sets are consolidated and made available to these new tools. Not for nothing do we say that a customer strategy is underpinned by an effective data strategy.
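As a rough illustration of what that first assessment might look like in practice, the sketch below walks a file share and counts files by modality. The extension-to-modality mapping and the function name are assumptions made for this example, not a standard taxonomy, and a real estate would need a far fuller list.

```python
from collections import Counter
from pathlib import Path

# Hypothetical mapping of file extensions to modalities; extend it for your own estate.
MODALITY_BY_EXTENSION = {
    ".txt": "text", ".docx": "text", ".pdf": "text",
    ".jpg": "image", ".jpeg": "image", ".png": "image", ".dcm": "image",
    ".mp4": "video", ".mov": "video",
    ".wav": "audio", ".mp3": "audio",
}


def inventory_by_modality(root):
    """Count the files under `root`, grouped by assumed modality."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Unknown extensions are grouped as "other" rather than dropped.
            modality = MODALITY_BY_EXTENSION.get(path.suffix.lower(), "other")
            counts[modality] += 1
    return counts
```

Running this against each location or environment in turn gives a first, crude picture of where images, video and audio actually sit before any consolidation work begins.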
Second, organisations will need to develop data and risk management strategies that fully integrate multi-modal data into data governance policies. For example, industries such as publishing are already well-versed in AI tools and can see wide-ranging content generation opportunities from multi-modal AI. But they must also be alert to the risks from the use of unauthorised images or malign instructions that could cause unpredictable behaviours in image-to-text AI systems.
Third, CIOs will have to trial and assess different vendors’ multi-modal AI tools to establish how well they analyse images, charts and video embedded in documents across different on-premises and cloud-based IT environments.
Fourth, CIOs will need to design pilot multi-modal projects, such as automating the extraction of data from documents with embedded visuals, to prove the ROI case for an application.
Fifth, IT teams will need to factor in the repeated iteration of new applications and carefully scale them up to enterprise-wide level.
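A pilot of the kind described in the fourth step can start very small. The sketch below, which is only an illustration and not a vendor tool, lists the images and other media embedded in a Word .docx file. It relies on the fact that a .docx file is a ZIP archive that stores embedded media under a `word/media/` folder; the function name is hypothetical.

```python
import zipfile


def list_embedded_media(docx_path):
    """Return the names of media files embedded in a .docx document."""
    with zipfile.ZipFile(docx_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # Embedded pictures and clips live under word/media/ inside the archive.
            if name.startswith("word/media/") and not name.endswith("/")
        )
```

Counting how many documents contain embedded visuals, and how many visuals each holds, gives a pilot team a baseline for sizing the extraction work before any multi-modal model is involved.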
New possibilities
Multi-modal AI is leveraging wide-ranging text, image and video information to enable a deeper understanding of customers and more targeted interactions with them. Suitably harnessed and derisked, these capabilities will unlock new business processes and applications that demand consistent human-like analysis and understanding across multiple corporate locations, technology infrastructures, and information sources.
Jim Liddle is Chief Innovation Officer, Data Intelligence & AI, at Nasuni
© 2024, Lyonsdown Limited. Business Reporter® is a registered trademark of Lyonsdown Ltd. VAT registration number: 830519543