Hitherto, specific scenes have been extracted from footage of sports coverage. For example, a device has been proposed that extracts event end points in sports footage for use in digest viewing. In this device, for sports footage in which the event start point is fixed, cut points are detected from the event start point onward, and each cut, namely the interval between adjacent detected cut points, is classified into a cut-length type according to its length. Appearance patterns of cut-length types are associated with event end points and stored in advance in an appearance pattern storage section. The event end point corresponding to the appearance pattern of the classified types is then extracted by referencing the appearance pattern storage section, and the extracted event end point is output.
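
The processing flow of such a device can be illustrated with a minimal sketch. The following is not the proposed device's implementation; the threshold values, the type names ("short", "medium", "long"), the contents of the pattern table, and the function names `classify_cuts` and `extract_event_end` are all assumptions introduced here for illustration. The sketch assumes that times are given in seconds and that each stored pattern maps to the number of the cut whose end is taken as the event end point.

```python
from bisect import bisect_right

# Hypothetical boundaries (seconds) between cut-length types.
THRESHOLDS = [2.0, 6.0]
TYPES = ["short", "medium", "long"]

# Hypothetical appearance pattern storage: a sequence of cut-length
# types maps to k, meaning the event ends at the end of the k-th cut.
PATTERNS = {
    ("long", "short", "short"): 1,
    ("medium", "long"): 2,
}


def classify_cuts(start, cut_points):
    """Classify each cut after the event start point by its length."""
    points = [start] + sorted(p for p in cut_points if p > start)
    lengths = [b - a for a, b in zip(points, points[1:])]
    return [TYPES[bisect_right(THRESHOLDS, length)] for length in lengths]


def extract_event_end(start, cut_points):
    """Return the event end point for the first stored pattern that
    matches a prefix of the classified types, or None if none matches."""
    points = sorted(p for p in cut_points if p > start)
    types = classify_cuts(start, cut_points)
    for n in range(1, len(types) + 1):
        k = PATTERNS.get(tuple(types[:n]))
        if k is not None and k <= len(points):
            return points[k - 1]
    return None


if __name__ == "__main__":
    cuts = [8.0, 9.5, 11.0]
    print(classify_cuts(0.0, cuts))
    print(extract_event_end(0.0, cuts))
```

In this sketch, a long cut followed by two short cuts is treated as indicating that the event ended at the first cut point. A table lookup keeps the association between patterns and end points as data, so that patterns could be changed per sport without altering the classification step.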