Seeing is understanding – using artificial intelligence to analyse multimedia content
29 April 2010
European Commission, CORDIS
The media produce a glut of material daily. Refining that ore into the gold of useful information requires new approaches. European researchers have now made automated multimedia analysis much smarter.
Picture a few seconds of coverage from a sporting event, say the Wimbledon finals. Your television might show a snippet of action plus the players’ names, scores, and other text scrolling across the screen, while the audio feed might feature expert commentary.
Multiply that multimedia feed by every sporting event being broadcast anywhere in the world. Then toss in all the other activities covered by the media – news, politics, pop culture, not to mention YouTube and other social media. And finally, imagine trying to make sense of this torrent of information so that it can be categorised, labelled, indexed, searched and retrieved as needed.
That’s the challenge that the EU-funded research project BOEMIE (Bootstrapping Ontology Evolution with Multimedia Information Extraction) accepted in 2006. Its researchers have now shown that, by using state-of-the-art artificial intelligence (AI) techniques to build and then refine highly structured knowledge bases, they can automatically or semi-automatically identify, analyse and index almost any multimedia content.
BOEMIE’s smart toolkit has significant commercial and research potential in any kind of multimedia annotation and retrieval. “Without semantic indexing, it’s very difficult to retrieve multimedia content,” says George Paliouras, BOEMIE’s technical manager. “BOEMIE offers a new approach to do this at a large scale and with high precision.”
By your bootstraps
It’s impossible to pick oneself up by one’s own bootstraps, but BOEMIE manages a close approximation.
BOEMIE must start with some knowledge of the domain it will be analysing. That basic knowledge comes from domain experts, who are prompted by the BOEMIE Semantic Manager to define and relate key concepts using natural language. For example, the concept “tennis match” might be defined as a type of sporting event, and the concept “Wimbledon finals” might be defined as an example of a tennis match.
BOEMIE automatically organises this information into an ontology – a formal way of representing concepts and the relationships between them within a chosen domain. Many AI applications use ontologies to represent knowledge about specific areas in a systematic and useful way.
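The tennis example above can be pictured as a toy ontology with “is-a” relations between concepts and instances of those concepts. The following is a minimal Python sketch for illustration only; it is not BOEMIE’s actual representation (the project built formal ontologies with dedicated tooling), and all class and method names here are assumptions.

```python
# Illustrative toy ontology: concepts, is-a relations, and instances.
class Ontology:
    def __init__(self):
        self.subclass_of = {}   # concept -> parent concept (or None)
        self.instance_of = {}   # individual -> its concept

    def add_concept(self, concept, parent=None):
        self.subclass_of[concept] = parent

    def add_instance(self, individual, concept):
        self.instance_of[individual] = concept

    def is_a(self, concept, ancestor):
        # Walk the is-a chain upward until we hit the ancestor or the top.
        while concept is not None:
            if concept == ancestor:
                return True
            concept = self.subclass_of.get(concept)
        return False

onto = Ontology()
onto.add_concept("sporting event")
onto.add_concept("tennis match", parent="sporting event")
onto.add_instance("Wimbledon finals", "tennis match")

print(onto.is_a("tennis match", "sporting event"))  # True
print(onto.instance_of["Wimbledon finals"])         # tennis match
```

A real ontology language adds much more (properties, constraints, reasoning), but the core idea is the same: concepts and the relationships between them, stated formally enough for a machine to use.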
When BOEMIE starts to analyse a multimedia feed, it uses newly developed video, image, audio and text analysis tools to extract as much information as it can. From the Wimbledon coverage, for example, it might note that there are two players interacting across a net on a playing surface of a particular size. This might allow it to categorise what it is “viewing” as a tennis match. From the audio track or on-screen text, it might tentatively connect the players with their names.
As the BOEMIE Ontology Evolution Toolkit tries to place the information it extracts into the existing ontology, it’s likely to discover that it needs new concepts. For example, it might notice that the commentator repeatedly uses the word “championship” or the phrase “Grand Slam.” The system automatically proposes these new concepts for the ontology, which can be accepted, rejected or modified by the domain expert.
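The discovery step described above — noticing that terms like “championship” recur in the commentary but are missing from the ontology — can be approximated by a simple frequency filter. This is only a hedged sketch: BOEMIE’s actual extraction uses far richer linguistic and multimedia analysis, and the tokeniser and threshold below are illustrative assumptions.

```python
from collections import Counter

def propose_concepts(transcript, known_concepts, min_count=2):
    """Suggest candidate concepts: terms that recur in the commentary
    but are not yet in the ontology. A real system would use linguistic
    analysis and multi-word phrases; this sketch just counts words."""
    words = [w.strip('.,"').lower() for w in transcript.split()]
    counts = Counter(w for w in words if len(w) > 3)
    return sorted(w for w, c in counts.items()
                  if c >= min_count and w not in known_concepts)

commentary = ("A championship point here at Wimbledon. Winning this "
              "Grand Slam championship would complete a career Grand Slam.")
known = {"wimbledon", "tennis match", "sporting event"}
print(propose_concepts(commentary, known))
# ['championship', 'grand', 'slam']
```

As in BOEMIE, such proposals would then go to a domain expert, who accepts, rejects or modifies each one before it enters the ontology.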
Much like a human researcher, BOEMIE searches the web for needed information. For example, it might access Wikipedia or other sources to define a Grand Slam event, find out where Wimbledon is located in order to place it on a map, or find biographies of the contestants.
The key bootstrapping cycle is managed by the BOEMIE Bootstrapping Controller. After enriching the knowledge base, the system then re-analyses the same footage, guided by the newly enriched ontology. This lets the system extract even more information and propose still more refinements to the knowledge base.
“This cycle of improvement of our domain knowledge and then going back with that improved knowledge to extract even more knowledge can happen several times,” says Paliouras. “This is the novel aspect of BOEMIE.”
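The cycle Paliouras describes can be sketched as a fixed-point loop: extract with the current ontology, propose enrichments, have them reviewed, and repeat until a round yields nothing new. The function names and the stand-in extract/review steps below are illustrative assumptions, not BOEMIE’s API.

```python
def bootstrap(footage, ontology, extract, review, max_rounds=5):
    """Repeat extract -> propose -> enrich until the ontology stabilises.
    `extract(footage, ontology)` returns candidate concepts;
    `review(candidates)` stands in for the domain expert's accept/reject."""
    for _ in range(max_rounds):
        candidates = extract(footage, ontology)
        accepted = review(c for c in candidates if c not in ontology)
        if not accepted:
            break  # fixed point reached: no new knowledge this round
        ontology |= set(accepted)
    return ontology

# Stand-in analysis: each concept already known unlocks related terms,
# mimicking how an enriched ontology lets the system see more on re-analysis.
related = {"tennis match": {"championship"},
           "championship": {"grand slam"}}

def extract(footage, ontology):
    found = set()
    for concept in ontology:
        found |= related.get(concept, set())
    return found

accept_all = list  # trivially accept every proposal
print(sorted(bootstrap("wimbledon.mp4", {"tennis match"}, extract, accept_all)))
# ['championship', 'grand slam', 'tennis match']
```

Note how “grand slam” is only reachable on the second pass, once “championship” is in the ontology — a miniature version of the re-analysis gain the project reports.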
The BOEMIE package also includes a semantic browser that allows non-expert users to search for the multimedia information they need using the concepts and relationships BOEMIE has built up.
Putting BOEMIE through its paces
The BOEMIE researchers decided to test the system in the area of sports, where they knew they could find plenty of multimedia content and would not need to involve specialists to help build the knowledge base.
They found that the combination of BOEMIE’s content analysis tools, the natural language interface and flexible ontology building tool, and their novel bootstrapping approach allowed them to extract information from multimedia sports coverage much more efficiently and accurately than existing automated systems.
Paliouras points out that BOEMIE is not limited to sports. The toolkit can speed and improve the analysis, categorisation, indexing and retrieval of almost any kind of multimedia content. “BOEMIE can add value to any form of multimedia analysis, and make the work of a domain expert easier and more manageable,” he says.
Project coordinator Constantine Spyropoulos notes that a variety of potential customers are interested in implementing parts of the BOEMIE toolkit. The International Association of Athletics Federations wants to boost its content retrieval capabilities using BOEMIE. Advertisers are interested in how BOEMIE can help them reach particular audiences and monitor the exposure of their products. Politicians are intrigued by BOEMIE’s ability to filter a torrent of information to determine what people are saying about a particular issue, and news organisations are exploring how the system can help them.
“The methodology we’ve developed is universal,” says Spyropoulos. “It can apply to any area, any domain.”
The BOEMIE project received funding from the ICT strand of the EU’s Sixth Framework Programme for research.