Conventionally, there is a technology that allows a user to intuitively recognize which area is being displayed among a plurality of images captured by a plurality of cameras mounted on a vehicle so as to capture a plurality of areas around the vehicle (hereinafter referred to as camera images). In this technology, the camera image of the area selected by the user is displayed in association with an image of the vehicle viewed from above (hereinafter referred to as a bird's eye view image).