The invention relates generally to extracting information from text data, and more specifically, to learning object/action pairs for recipe ingredients.
Creating a model for action/object compatibility is challenging in any domain. Creating such a model manually can be very time consuming. Generally, learning a model from a data corpus is both faster and more reliable. However, in the cooking domain there are often inconsistencies in word usage that make it difficult to draw the dividing lines needed to generate rules for a model. For example, in the cooking domain, the action of removing an inedible exterior wrapping from a food item can be called, among other things, skinning, peeling, or shelling. Continuing with the example, bananas grow within peels or skins, shrimp grow within shells, and peas grow within pods. One can skin the banana or peel the banana, but one cannot “shell the banana”; one can shell the shrimp or peel the shrimp, but one cannot “skin the shrimp”; and one can shell the peas or peel the peas, but one cannot “skin the peas” or “pod the peas.” These types of semantic subdivisions among terms that are normally considered synonymous can make it difficult to automate the process of learning a model to express cooking domain actions.
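By way of illustration only, the compatibility judgments in the example above can be sketched as a lookup over observed action/object pairs. The following Python sketch is not the method of the invention. The names `OBSERVED_PAIRS`, `learn_compatibility`, and `is_compatible`, the `min_count` threshold, and the toy list of pairs are all assumptions introduced here, and the pairs are taken solely from the banana, shrimp, and peas example rather than from any real corpus.

```python
from collections import Counter

# Hypothetical toy "corpus" of (action, object) pairs, built only from the
# acceptable phrasings in the example above. A real corpus would be
# extracted from recipe text.
OBSERVED_PAIRS = [
    ("peel", "banana"), ("skin", "banana"),
    ("shell", "shrimp"), ("peel", "shrimp"),
    ("shell", "peas"), ("peel", "peas"),
]


def learn_compatibility(pairs):
    """Count how often each action is observed with each object."""
    return Counter(pairs)


def is_compatible(model, action, obj, min_count=1):
    """Treat a pair as compatible if it was observed at least min_count times.

    min_count is an assumed threshold, not a value from the source.
    """
    return model[(action, obj)] >= min_count


model = learn_compatibility(OBSERVED_PAIRS)
print(is_compatible(model, "peel", "banana"))   # True
print(is_compatible(model, "shell", "banana"))  # False
```

Even in this simplified form, the sketch shows why near-synonymous actions cannot be treated as interchangeable: "peel" is compatible with all three objects, whereas "skin" and "shell" are each compatible with only some of them, and "pod" with none.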