RecipeInfoExtraction – Information Extraction from Unstructured Recipe Data


Food recipes are a common source of information for people who want to learn to cook new dishes or simply need help choosing what to eat. Many websites allow users to search for recipes that suit their personal needs and motivations. Since the number of recipes available on these websites runs into the thousands, they usually rely on structured information, such as the name, preparation time or ingredients of a recipe, to let users search efficiently. However, this structured information often lacks details that are relevant for food choice. One example is the nutritional composition of the recipe, which matters because, according to the World Health Organization, nutrition is a cornerstone of good health. A poor diet has a negative impact on mental health and can cause several chronic diseases, including diabetes and heart disease.

The unstructured information in recipe data, such as the preparation instructions, contains details that may help users find recipes, such as:

> Ingredients and respective quantities;

> Cooking methods used;

> Food preparation techniques used (e.g. mixing, cutting);

> Necessary equipment and tools.
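
As an illustration of how the items listed above might be pulled out of a preparation instruction, here is a minimal Python sketch. It is not the method used in the thesis: the keyword lexicons, the quantity pattern and the `extract` function are all hypothetical and deliberately small, and the sketch does no stemming, so a word such as "baked" would not be recognised.

```python
import re

# Hypothetical, hand-picked lexicons (assumptions for illustration only);
# a real system would need much larger vocabularies or a trained model.
COOKING_METHODS = {"bake", "boil", "fry", "grill", "roast", "simmer", "steam"}
PREP_TECHNIQUES = {"chop", "cut", "mix", "slice", "stir"}
EQUIPMENT = {"bowl", "oven", "pan", "pot"}

# Matches "<number> <unit> [of] <name>", e.g. "200 g of flour" or "100 ml milk".
# Only a few metric units and single-word ingredient names are covered.
QUANTITY_RE = re.compile(
    r"(?P<amount>\d+(?:\.\d+)?)\s*(?P<unit>g|kg|ml|l)\s+(?:of\s+)?(?P<name>[a-z]+)"
)


def extract(text):
    """Return ingredients, cooking methods, techniques and equipment found in text."""
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z]+", lowered))
    ingredients = [
        (m.group("name"), float(m.group("amount")), m.group("unit"))
        for m in QUANTITY_RE.finditer(lowered)
    ]
    return {
        "ingredients": ingredients,
        "cooking_methods": sorted(tokens & COOKING_METHODS),
        "prep_techniques": sorted(tokens & PREP_TECHNIQUES),
        "equipment": sorted(tokens & EQUIPMENT),
    }


info = extract("Mix 200 g of flour with 100 ml milk in a bowl, then bake in the oven.")
print(info)
```

A keyword lookup like this is only a starting point; free-text instructions usually call for proper tokenisation, lemmatisation and handling of multi-word ingredient names.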

This information may also be combined with complementary sources. Food composition databases, for instance, contain extensive information about foods, including their nutritional composition. EuroFIR is a European institution that gathers food composition databases compiled in different countries and makes them available in a normalized format.

The aim of this project is to apply text mining and information extraction techniques to semi-structured recipe text data, and to estimate the nutritional composition of each recipe by automatically associating its ingredients with food composition database entries.
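
The association step can be sketched as follows, under several assumptions that do not come from the project itself: ingredient names are matched to database entries with simple string similarity (`difflib` from the Python standard library), all quantities are already expressed in grams, and the `FOOD_COMPOSITION` table holds placeholder values per 100 g that are not taken from EuroFIR or any real database. The function names are hypothetical.

```python
from difflib import get_close_matches

# Placeholder table, values per 100 g. These numbers are illustrative only
# and are NOT taken from EuroFIR or any real food composition database.
FOOD_COMPOSITION = {
    "wheat flour": {"energy_kcal": 350.0, "protein_g": 10.0},
    "whole milk": {"energy_kcal": 60.0, "protein_g": 3.0},
    "butter": {"energy_kcal": 700.0, "protein_g": 1.0},
}


def match_entry(name, cutoff=0.5):
    """Return the most similar database entry name, or None if nothing is close enough."""
    matches = get_close_matches(name, list(FOOD_COMPOSITION), n=1, cutoff=cutoff)
    return matches[0] if matches else None


def estimate_nutrition(ingredients):
    """Sum nutrients over (name, grams) pairs; also report names with no match."""
    totals = {"energy_kcal": 0.0, "protein_g": 0.0}
    unmatched = []
    for name, grams in ingredients:
        entry = match_entry(name)
        if entry is None:
            unmatched.append(name)
            continue
        for nutrient, per_100g in FOOD_COMPOSITION[entry].items():
            totals[nutrient] += per_100g * grams / 100.0
    return totals, unmatched


totals, unmatched = estimate_nutrition(
    [("flour", 200.0), ("milk", 100.0), ("xyz", 5.0)]
)
print(totals, unmatched)
```

Reporting unmatched ingredients separately, rather than silently skipping them, makes it visible when an estimate is incomplete. Character-level similarity is a crude stand-in for the matching a real system would need, since it can confuse foods with similar spellings.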


Main Outcomes:

A system capable of automatically extracting structured information from semi-structured recipe data and estimating the nutritional composition of recipes through automatic association with food composition data.


Author: Nuno Silva

Type: MSc thesis

Partner: Faculdade de Engenharia da Universidade do Porto

Year: 2018



Project Flyer