WO 2011/021128 A2 (US 2012/0141000) discloses a method and a system for image analysis, including:
obtaining a sequence of images;
performing a vision-based analysis on at least one of the sequence of images to obtain data for classifying a state of a subject represented in the images;
determining at least one value of a physiological parameter of a living being represented in at least one of the sequence of images, wherein the at least one value of the physiological parameter is determined through analysis of image data from the same sequence of images from which the at least one image on which the vision-based analysis is performed is taken; and
classifying a state of the subject using the data obtained with the vision-based analysis and the at least one value of the physiological parameter.
The document further discloses several refinements of the method and the system. For instance, the use of remote photoplethysmographic (PPG) analyses is envisaged.
Basically, photoplethysmography and related vision-based imaging approaches are considered as conventional techniques which can be used to detect physiological information and, based thereon, vital signals or parameters in a subject of interest. Typically, the vital parameters are derived in a mediate way. Vital parameter detection can be based on the detection of volume changes of organs or organ components in a living being (or: subject of interest). More specifically, in some cases, photoplethysmography can be considered as an optical technique which can be utilized to detect blood volume changes in the microvascular portion of the subject's tissue. Typically, photoplethysmographic measurements are directed at the skin surface of the subject. Conventionally known PPG approaches include so-called contact PPG devices which can be attached to the skin of the subject, for instance to a fingertip. Generally, the detected PPG signal (or: waveform) comprises a pulsatile physiological waveform attributable to cardiac synchronous changes in the blood volume with every heartbeat. Besides this, the PPG waveform can comprise further physiological information attributable to respiration, oxygen saturation, Traube-Hering-Mayer waves, and even to further physiological phenomena.
Recently, so-called remote photoplethysmography has made enormous progress in that unobtrusive non-contact remote measurements based on conventional cameras have been demonstrated. The term “conventional camera” may refer to off-the-shelf cameras, for instance digital video cameras, digital (photo) cameras providing video recording functionality, or even to integrated cameras in desktop computers, mobile computers, tablets and further mobile devices, such as smartphones. Furthermore, so-called webcams attachable to computing devices may be covered by the term “conventional camera”. Furthermore, also medical monitoring devices, video conferencing systems and security surveillance devices can make use of standard camera modules.
Typically, these cameras can comprise responsivity (or: sensitivity) characteristics adapted to the visible portion of the electromagnetic spectrum. As used herein, visible radiation may be defined by the general radiation perception ability of the human eye. By contrast, non-visible radiation may refer to spectral portions which are not visible to a human's eye, unless optical aid devices converting non-visible radiation into visible radiation are utilized. Non-visible radiation may relate to infrared radiation (or: near-infrared radiation) and to ultraviolet (UV) radiation. It should be understood that in some cases, conventional cameras may also be sensitive to non-visible radiation. For instance, a camera's responsivity range may cover the whole visible spectrum and also adjacent spectral portions belonging to the non-visible spectrum or, at least, to a transition area between the visible and the non-visible spectrum. Still, however, exemplarily referring to night vision applications and thermal imaging applications, also cameras primarily directed at non-visible portions of the electromagnetic spectrum can be envisaged.
Nowadays, digital technology gains even further significance in everyday life. By way of example, images and sequences thereof are digitally recorded, processed and reproduced, and can be duplicated without loss. An individual may be confronted with digital imaging devices in public (e.g., traffic monitoring, security monitoring, etc.), in private life (e.g., mobile phones, mobile computing devices including cameras), when doing sports or work-outs (e.g., heart rate monitoring, respiration monitoring applying remote PPG techniques), at work (e.g., vision-based machine or engine monitoring, fatigue monitoring or drowsiness monitoring, vision-based access control, etc.), and even in healthcare environments (e.g., patient monitoring, sleep monitoring, etc.). Consequently, regardless of whether the individual is aware or unaware of being monitored in the individual case, a huge amount of (image) data can be gathered in everyday life.
It is an object of the present invention to provide a system and a method for processing data addressing the above-mentioned issues and enhancing privacy preservation while still allowing for an extraction of vital signals from the recorded data. Furthermore, it would be advantageous to provide a system and a corresponding method configured for hiding privacy related information which is not necessarily essential to the vital signal extraction. It would be further desirable to provide a computer program configured for implementing said method.
In a first aspect of the present invention a device for processing data derivable from remotely detected electromagnetic radiation emitted or reflected by a subject is presented, the data comprising physiological information, the device comprising:
a signal detector unit configured for receiving an input signal and for transmitting indicative entities thereof, the indicative entities being indicative of physiological information representative of at least one vital parameter in a subject of interest; and
a processing unit configured for extracting the at least one vital parameter from a transmitted signal comprising the indicative entities, wherein the at least one vital parameter is extracted under consideration of detected skin-colored properties representing circulatory activity;
wherein the signal detector unit is further configured for detecting the indicative entities under consideration of at least one defined descriptive model describing a relation between physical skin appearance characteristics and a corresponding representation in the input signal such that non-indicative side information represented by non-indicative entities in the input signal is substantially undetectable in the resulting transmitted signal.
The present invention addresses privacy preservation issues by treating non-indicative entities in the input signal in such a way that substantially no conclusions regarding personal or privacy information can be drawn therefrom. It is acknowledged in this connection that the indicative signal entities indeed may also comprise privacy-related information. However, the indicative entities are considered essential for the vital parameter extraction the device is targeted at. Basically, the indicative entities may represent skin portions of the subject of interest. It is worth mentioning in this connection that also a plurality of subjects can be present in or represented by the input signal. Consequently, also the indicative entities may be representative of the plurality of subjects.
The non-indicative entities may cover surroundings or environmental information. The non-indicative entities may further comprise non-indicative information (in terms of the at least one vital parameter of interest) which is still closely related to the observed subject's privacy. This may involve clothing information and housing information. Consequently, also the so-called non-indicative side information can comprise privacy information. By way of example, the non-indicative side information may indicate an individual's personal wealth status. Furthermore, a usual place of residence or current whereabouts might be extracted from the non-indicative side information. It is therefore considered beneficial that the non-indicative entities are substantially disregarded during further processing.
It is understood that also the indicative entities in the input signal may comprise personal or privacy information. Still, given that signal portions formed by the indicative entities can basically be taken out of the overall context or representation observed by the signal detector unit, privacy preservation can be improved. By way of example, primarily indicative skin portions of the subject of interest can remain in the resulting transmitted signal. Assuming that clothing information, housing information and further side information representative of surroundings are no longer detectable in the resulting transmitted signal, the risk of privacy information losses or even privacy information misuse can be reduced significantly.
As used herein, in some embodiments, the term “circulatory activity” may refer to cardiovascular activity or, in general, to vascular activity. It should be understood that also respiratory activity is closely related to vascular activity. Consequently, the at least one vital parameter can represent the subject's heartbeat, heart rate, heart rate variability, respiration rate, respiration rate variability, pulse oxygen saturation, and suitable derivatives and combinations thereof. In a preferred embodiment the processing unit can make use of photoplethysmographic or, even more preferred, remote photoplethysmographic approaches. Basically, circulatory activity of the subject can be monitored in a mediate way through observing the subject's skin. Slight fluctuations of skin appearance, such as skin color fluctuations, can be attributed to vascular activity, for example.
As used herein, each of the terms “indicative entities” and “non-indicative entities” may refer to particular signal fractions or elements (in terms of an observed area). It is worth noting that each of the indicative entities and the non-indicative entities may refer to at least a single area element or to a set of area elements. By way of example, given that the input signal is encoded in (digitized) data representative of vision-based information or image information, the respective entities may refer to at least a pixel or to a set of pixels. For instance, when a sequence of signal samples (or: image samples) is processed, each of the indicative entities and the non-indicative entities may be formed of respective portions in the samples. Assuming that the input signal is still embodied in form of electromagnetic radiation, the indicative entities and the non-indicative entities may refer to respective portions of the observed area. Also in this case both the indicative entities and the non-indicative entities can be formed of a respective signal sub-portion or of a respective set of sub-portions.
The at least one defined descriptive model can be embodied by a model for (directly or indirectly) describing skin in terms of electromagnetic radiation. It is preferred in some embodiments that the descriptive model is a human skin representation model or, more specifically, a human skin color model.
According to a further aspect, the signal detector unit comprises at least one color filter element comprising a filter response adapted to spectral properties corresponding to the at least one descriptive model. The at least one filter element can be embodied by at least one optical filter. The at least one color filter element can also comprise a color filter array comprising a plurality of single filter elements. The at least one color filter element can be configured in such a way that basically indicative entities may pass the respective filter element while non-indicative entities are blocked, or at least, attenuated. It is preferred that the at least one filter element is configured for stopping non-indicative entities. By way of example, the at least one color filter element can comprise filter characteristics adapted to skin color properties. In this way, skin-indicative entities may pass while non-skin entities can be blocked, suppressed, or stopped.
The at least one color filter element can be formed by at least one optical lens filter, for example. In the alternative, the at least one color filter element can be embodied by at least one semiconductor optics filter element. By way of example, the at least one color filter element can be embodied by a Bayer filter array making use of a plurality of semiconductor filters. In this way, incoming signals can be filtered at the level of the sensor device before being converted into digital data. Therefore, no digital representation or, if at all, merely a manipulated representation of the non-indicative entities can be encoded or present in captured digital signals. In other words, the device can make use of a sensor means to which a respective input filter element is coupled which filters input radiation such that basically skin-indicative entities may pass, while non-skin entities are blocked, or, at least, attenuated.
According to yet another aspect the input signal comprises an input sequence of signal samples, wherein the signal detector unit comprises at least one data processing detector configured for processing respective signal samples of the input sequence under consideration of spectral information embedded in signal sample entities, thereby generating a transmitted signal sequence, wherein the at least one data processing detector is further configured for detecting the indicative entities under consideration of the at least one descriptive model describing a relation between physical appearance characteristics and a corresponding data representation in the signal samples.
This embodiment can make use of digital data processing of an input sequence already captured and encoded (into digital data) in advance. A suitable analogue-digital converter can be formed by a respective sensor means, for instance, a camera. To this end, for instance, CCD-cameras and CMOS-cameras can be envisaged. Consequently, a basically discrete sequence of signal samples (or: frames) can be captured and delivered to the signal detector unit.
In this embodiment, each entity may comprise at least a single pixel or a set of pixels. It is worth mentioning in this connection that, by detecting or identifying indicative pixels in the signal samples, vice versa, also the non-indicative entities can be identified, at least in a mediate way. Consequently, each of the signal samples in the input sequence can be segmented into at least one indicative portion and at least one non-indicative portion. It is preferred that the at least one non-indicative portion (which indeed can be indicative of privacy information) is excluded from further signal processing or distribution measures.
The at least one descriptive model may provide a link between physical skin appearance characteristics in terms of electromagnetic radiation characteristics and a corresponding digital data representation making use of signal encoding conventions for visual signals in digital data.
According to yet another aspect the device may further comprise a masking unit configured for masking respective non-indicative entities in the transmitted signal sequence, wherein the data processing detector is further configured for classifying entities into one of an indicative state and a non-indicative state. For instance, the data processing detector can be configured to flag respective pixels in the signal samples. In this way, at least one of an indicative state and a non-indicative state can be assigned to respective pixels and, consequently, to respective entities. To this end, the data processing detector can make use of a skin classifier or, more particularly, of a skin pixel classifier.
Eventually, a transmitted signal can be obtained which is based on the input signal sequence and still comprises indicative entities or portions. By contrast, the transmitted signal sequence may further comprise masked portions or entities which may replace non-indicative entities. By way of example, the masking unit can be configured for assigning a constant (color) value to non-indicative entities. Furthermore, in an alternative embodiment, the masking unit can be configured for blurring non-indicative pixels, or, more preferably, non-indicative portions comprising a plurality of non-indicative pixels. Typically, blurred portions may sufficiently hide underlying privacy-related information. As used herein, the term blurring may refer to various image manipulating measures directed at reducing (privacy) information content. It can be envisaged in this connection that further image or data manipulating measures can be applied to non-indicative portions of the signal samples so as to hide respective privacy information.
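The constant-value masking described above can be illustrated with a minimal sketch. It assumes a frame given as rows of (R, G, B) tuples; the names `mask_non_indicative`, `toy_skin_rule` and `MASK_VALUE` are hypothetical, and the toy rule is merely a placeholder for whichever skin pixel classifier is actually used.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Frame = List[List[Pixel]]      # rows of pixels

MASK_VALUE: Pixel = (0, 0, 0)  # hypothetical constant fill colour


def mask_non_indicative(frame: Frame,
                        is_indicative: Callable[[Pixel], bool],
                        fill: Pixel = MASK_VALUE) -> Frame:
    """Return a copy of `frame` in which every pixel classified as
    non-indicative is replaced by the constant value `fill`."""
    return [[px if is_indicative(px) else fill for px in row]
            for row in frame]


def toy_skin_rule(px: Pixel) -> bool:
    # Placeholder classifier (an assumption, not taken from the text):
    # treat a pixel as skin if red dominates green and green dominates blue.
    r, g, b = px
    return r > g > b


frame = [[(200, 140, 110), (30, 90, 200)],
         [(20, 200, 40), (180, 120, 100)]]
masked = mask_non_indicative(frame, toy_skin_rule)
print(masked)
```

Only the two reddish pixels survive in `masked`; the other two are overwritten, so their original colour cannot be recovered from the transmitted sample.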
According to yet an even further aspect the signal samples are encoded under consideration of a signal space convention applying a color model, wherein an applied signal space comprises complementary channels for representing the entities forming the signal samples, wherein respective components of the entities are related to respective complementary channels of the signal space.
Typically, digital image representation requires an A/D (analogue/digital) conversion under consideration of a predefined conversion convention. In other words, the entities in the signal samples may comply with a signal space convention which basically describes a relation between electromagnetic radiation characteristics and respective signal properties of the entities in the (digital) signal samples. Typically, a signal space or color space may involve a combination of a color model and a respective mapping function which is utilized for data generation. Generally, the signal space may comprise two or more dimensions. A single pixel in a signal sample may be represented by a value or a respective vector (herein also referred to as index element) in the signal space.
In some embodiments, the signal space is an additive color signal space, wherein the complementary channels are additive color channels, wherein the entities are represented by at least three absolute components, wherein the at least three absolute components represent distinct color components indicated by the additive channels, and wherein the additive channels are related to defined spectral portions. Such a color space may be embodied by an RGB color space, or by derivatives thereof. Furthermore, subtractive color signal spaces can be envisaged, for instance, a CMYK color space, and respective derivatives. Still, alternatively, the signal space can be configured as a signal space basically indicative of luminance information and chrominance information. This may apply, for instance, to the YUV color space.
According to a further embodiment the signal space comprises a color representation basically independent of illumination variations. It should be noted in this connection, that also a “reduced” signal space may be utilized for detecting the indicative entities and, respectively, the non-indicative entities in the signal samples. By way of example, a subspace of the YUV signal space can be utilized to this end. Furthermore, signal spaces can be converted into derivative signal spaces in which luminance information is basically disregarded. By way of example, an additive color signal space (such as RGB) can be “mapped” to a chromaticity plane which may provide for color property representation regardless of actual luminance. In this way, luminance normalization can be achieved. Consequently, the detection of the indicative entities can be facilitated. For instance, respective R-values, G-values and B-values of the RGB signal space can be divided by a predefined linear combination of R, G and B, respectively. Such a normalization can further provide for a dimensional reduction. Preferably, the descriptive model or skin model makes use of the signal space in that an underlying vision-based skin model is defined and expressed in terms of the respective signal space convention.
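The luminance normalization mentioned above may be sketched as follows. The sketch assumes that the "predefined linear combination" is simply the sum R + G + B (equal weights), which is one common choice and not prescribed by the text; the function name `to_chromaticity` is hypothetical.

```python
from typing import Tuple


def to_chromaticity(r: float, g: float, b: float) -> Tuple[float, float]:
    """Map an RGB triple to the (r, g) chromaticity plane by dividing by
    R + G + B. The third coordinate is redundant (it equals 1 - r - g),
    which is the dimensional reduction referred to in the text."""
    total = r + g + b
    if total == 0:
        # Black pixel: no colour information; assume the neutral point.
        return (1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total)


# The same surface colour at full and at half brightness.
bright = to_chromaticity(200, 140, 100)
dim = to_chromaticity(100, 70, 50)
print(bright, dim)
```

Both calls yield the same chromaticity pair, which illustrates why a skin model defined in such a plane can be less sensitive to illumination intensity.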
According to another aspect it is further preferred that the descriptive model is a skin color model describing skin representation under consideration of signal space conventions. By way of example, the descriptive model can make use of look-up table data for comparison measurement and classification. The look-up table data may comprise a variety of predefined values representing indicative entities. According to one embodiment, the descriptive model is an explicit skin model. An explicit skin model may comprise a defined subspace of a signal space which is considered attributable to a representation of the subject's skin, for example. However, in the alternative, the descriptive model can be at least one of a non-parametric skin model and a parametric skin model.
By way of example, a non-parametric skin model can be based on a look-up table comprising a plurality of histograms representing a variety of reference measurements. A parametric skin model can make use of a simplified function-type representation of classification data in the signal space. By way of example, based on histograms obtained through reference measurements, Gaussian functions can be defined for describing a probability distribution with regard to whether or not a given entity (or: pixel) can be considered as an indicative entity or a non-indicative entity. In this connection, single Gaussian and multiple Gaussian functions can be envisaged.
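A single-Gaussian parametric skin model of the kind outlined above might look as follows. This is a sketch under stated assumptions: the model lives in a two-dimensional (r, g) chromaticity plane, and the mean, covariance and decision threshold are invented illustrative numbers, not values obtained from reference measurements.

```python
import math
from typing import Tuple

# Hypothetical model parameters in the (r, g) chromaticity plane.
MEAN: Tuple[float, float] = (0.45, 0.31)
COV = ((0.0020, -0.0005),
       (-0.0005, 0.0010))
THRESHOLD = 0.5  # hypothetical decision threshold on the score


def skin_likelihood(rg: Tuple[float, float],
                    mean: Tuple[float, float] = MEAN,
                    cov=COV) -> float:
    """Unnormalized Gaussian score in (0, 1]; equals 1.0 at the mean."""
    dr, dg = rg[0] - mean[0], rg[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Squared Mahalanobis distance via the closed-form 2x2 inverse.
    dist2 = (d * dr * dr - (b + c) * dr * dg + a * dg * dg) / det
    return math.exp(-0.5 * dist2)


def is_indicative(rg: Tuple[float, float]) -> bool:
    """Classify a chromaticity pair as indicative (skin) or not."""
    return skin_likelihood(rg) >= THRESHOLD


skin_like = (200 / 440, 140 / 440)   # chromaticity of RGB (200, 140, 100)
blue_like = (30 / 320, 90 / 320)     # chromaticity of RGB (30, 90, 200)
print(is_indicative(skin_like), is_indicative(blue_like))
```

A multiple-Gaussian variant would combine several such scores with mixture weights; adapting the model to monitoring conditions then amounts to swapping or adjusting `MEAN` and `COV`.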
According to another advantageous embodiment, the masking unit is further configured for processing the indicative entities such that the at least one vital parameter is substantially detectable in the transmitted signal sequence, wherein non-indicative side information represented by the indicative entities is at least partially attenuated in the transmitted signal sequence. In this context, processing the indicative entities may involve blurring sets of indicative entities.
This embodiment is based on the idea that also the indicative entities can be processed so as to further enhance privacy preservation. It is preferred in this connection that processing parameters are chosen such that the to-be-extracted vital parameter is substantially preserved in the processed samples. Since vital parameter extraction may involve spatially averaging indicative regions of interest, blurring operations or similar algorithms can be applied to the indicative entities, provided that respective average values of interest (e.g., spatial mean pixel color values) remain substantially unchanged. By way of example, a blurring algorithm (e.g., Gaussian blur) can be applied to the indicative entities. In this way, privacy-related information, such as skin details, etc., can be diminished or attenuated while vital parameter-indicative information can be preserved, that is, for instance, a mean pixel color in an indicative region of interest is not affected. Consequently, mean pixel color fluctuations can be preserved for further analysis. By way of example, spatial blurring involving selective filter algorithms may be applied to the regions comprising the indicative entities.
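The requirement that the spatial mean remains unchanged can be made concrete with a simple stand-in for a blur. The sketch below does not implement a Gaussian blur; instead it pulls every indicative sample towards the mean of the indicative region, which attenuates detail while preserving the regional mean. The function name `attenuate_detail` and the `strength` parameter are hypothetical, and a single colour channel is assumed for brevity.

```python
from typing import List


def attenuate_detail(values: List[float],
                     indicative: List[bool],
                     strength: float = 0.8) -> List[float]:
    """Shrink deviations of indicative samples from their regional mean.

    strength = 0 leaves the samples untouched; strength = 1 replaces each
    indicative sample by the regional mean. Non-indicative samples are
    passed through unchanged and are never mixed into the mean."""
    region = [v for v, keep in zip(values, indicative) if keep]
    if not region:
        return list(values)
    mean = sum(region) / len(region)
    return [mean + (1.0 - strength) * (v - mean) if keep else v
            for v, keep in zip(values, indicative)]


channel = [120.0, 140.0, 100.0, 30.0, 160.0]   # one colour channel
flags = [True, True, True, False, True]        # indicative-state flags
smoothed = attenuate_detail(channel, flags)
print(smoothed)
```

Because each deviation from the mean is scaled by the same factor, the deviations still sum to zero, so frame-to-frame fluctuations of the regional mean (the quantity of interest for vital parameter extraction) are carried over to the processed samples.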
It is further preferred in this connection that blurring parameters (e.g., blurring filter characteristics) are chosen such that indicative entities or sets of indicative entities comprising high contrast (huge differences in luminance and/or color) are excluded from blurring operations. High contrast areas may adversely influence average values and may therefore distort processed vital parameter-representative signals.
It is further envisaged to apply blurring operations or similar image processing operations to both the indicative entities and the non-indicative entities. In this connection, however, it is preferred that regions comprising the indicative entities and regions comprising the non-indicative entities are processed separately so as to avoid blending or mixing up indicative entities and non-indicative entities.
According to yet another aspect, the device further comprises a database providing a plurality of descriptive models attributable to an influence parameter selected from the group consisting of skin color type, ethnic region, ethnic group, body region, sex, sensor unit characteristics, and illumination conditions, and combinations thereof.
According to this approach the device can make use of a descriptive model currently considered suitable for an actual monitoring environment. The plurality of descriptive models may comprise a plurality of non-parametric skin models. In this case, even though non-parametric skin models can be considered somewhat inflexible or static, the device as a whole can be adapted to varying monitoring conditions.
According to an alternative exemplary aspect the signal detector unit is further configured for adapting the present descriptive model under consideration of an influence parameter selected from the group consisting of skin color type, ethnic region, ethnic group, body region, sex, sensor unit characteristics, and illumination conditions, and combinations thereof. By way of example, a parameter of a parametric skin model can be adjusted accordingly so as to adapt the descriptive model to given monitoring conditions. It is worth mentioning in this connection that the above influence parameters basically may influence the appearance and the perception of skin colors and, therefore, may also influence accuracy of the indicative entity detection.
According to still yet a further aspect the device further comprises a sensor unit, in particular a camera, configured for sensing electromagnetic radiation at a distance, wherein the sensor unit is coupled to the signal detector unit such that non-indicative entities in the input signal are basically disregarded when transmitting respective signal samples. By way of example, a camera can be integrated in the device such that no external access to a captured input sequence is allowed.
According to yet another aspect the sensor unit comprises a response characteristic adapted to the descriptive model such that non-indicative entities are basically disregarded when capturing the signal samples. In this connection, the detector unit and the masking unit can be implemented in the camera, for instance, via optical components, (digital) data processing components, and combinations thereof.
According to yet another aspect the device may further comprise an output interface for distributing the sequence of processed samples. The sequence of processed samples in which non-indicative entities are basically undetectable can be forwarded, distributed or copied without the risk of revealing non-indicative side information.
According to a further aspect the device further comprises a feature detector configured for detecting identity-related prominent features in signal samples of the input sequence, and a feature masking unit configured for masking respective entities in the transmitted signal sequence. This embodiment may even further contribute to enhancing privacy preservation. By way of example, the feature detector can be configured for applying eye recognition, face recognition, mouth recognition, hair recognition, and combinations thereof. Since primarily skin portions in the subject of interest are addressed, additional prominent features which may be considered even further indicative of privacy-related information (rather than of vital parameters of interest) can be detected and removed from the respective signal sample.
According to yet an even further aspect the processing unit is further configured as photoplethysmographic processing unit capable of extracting the at least one vital parameter of interest from the sequence of transmitted samples, wherein the at least one vital parameter can be extracted under consideration of vascular activity represented by skin color properties.
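One way such an extraction could proceed is sketched below for the pulse rate. The sketch assumes that a trace of per-frame spatial mean values of the indicative region is already available, and it searches a plausible pulse band for the frequency with the highest spectral power. The function name `estimate_pulse_rate_bpm`, the band limits and the synthetic test trace are illustrative assumptions; practical remote PPG processing typically involves further steps such as channel combination and motion compensation.

```python
import math
from typing import List


def estimate_pulse_rate_bpm(trace: List[float], fps: float,
                            lo_bpm: int = 40, hi_bpm: int = 240) -> int:
    """Return the rate (beats per minute) whose frequency carries the most
    power in the mean-removed trace, scanned in 1 bpm steps."""
    n = len(trace)
    mean = sum(trace) / n
    centred = [x - mean for x in trace]   # remove the constant skin tone
    best_bpm, best_power = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        omega = 2.0 * math.pi * (bpm / 60.0) / fps   # radians per frame
        re = sum(x * math.cos(omega * k) for k, x in enumerate(centred))
        im = sum(x * math.sin(omega * k) for k, x in enumerate(centred))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic trace: a faint 1.2 Hz (72 bpm) colour fluctuation on top of a
# constant mean skin value, sampled at 30 frames per second for 10 seconds.
fps = 30.0
trace = [128.0 + 0.5 * math.sin(2.0 * math.pi * 1.2 * k / fps)
         for k in range(300)]
rate = estimate_pulse_rate_bpm(trace, fps)
print(rate)
```

Since only regional mean values enter the computation, masked or attenuated non-indicative entities in the transmitted samples do not affect the estimate.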
There exist several embodiments of the signal detector unit and the processing unit and, if any, of respective subcomponents thereof. In a first, fairly simple embodiment the signal detector unit and the processing unit as well as their respective (data processing) subcomponents can be embodied by a common processing device which is driven (or: controlled) by respective logic commands. Such a processing device may also comprise suitable input and output interfaces. However, in the alternative, each of the signal detector unit and the processing unit can be embodied by separate processing devices controlled or controllable by respective commands. Hence, each respective processing device can be adapted to its special purpose. Consequently, a distribution of tasks can be applied, wherein distinct tasks are processed (or: executed) on a single processor of a multi-processor processing device or, wherein image processing related tasks are executed on an image processor while other operational tasks are executed on a central processing unit. The above may also refer to subcomponents of the signal detector unit and the processing unit. Each of the processing unit, the signal detector unit and their respective subcomponents can be implemented as a virtual part of a processing environment or as a discrete (e.g., hardware-coded) processing element. Hybrid implementations can be envisaged.
In a further aspect of the invention a method for processing data derivable from remotely detected electromagnetic radiation emitted or reflected by a subject is presented, the data comprising physiological information, the method comprising the steps of:
receiving an input signal and transmitting indicative entities thereof, the indicative entities being indicative of physiological information representative of at least one vital parameter in a subject of interest;
detecting the indicative entities under consideration of at least one defined descriptive model describing a relation between physical skin appearance characteristics and a corresponding representation in the input signal such that non-indicative side information represented by non-indicative entities in the input signal is substantially undetectable in a resulting transmitted signal; and
extracting the at least one vital parameter from the transmitted signal comprising the indicative entities, wherein the at least one vital parameter is extracted under consideration of detected skin color properties representing circulatory activity.
Advantageously, the method can be carried out utilizing the device for extracting information of the invention.
In yet another aspect of the present invention, there is provided a computer program which comprises program code means for causing a computer to perform the steps of the method when said computer program is carried out on a computer. The program code (or: logic) can be encoded in one or more non-transitory, tangible media for execution by a computing machine, such as a computer. In some exemplary embodiments, the program code may be downloaded over a network to a persistent storage from another device or data processing system through computer readable signal media for use within the device. For instance, program code stored in a computer readable storage medium in a server data processing system may be downloaded over a network from the server to the device. The data processing device providing program code may be a server computer, a client computer, or some other device capable of storing and transmitting program code.
As used herein, the term “computer” stands for a large variety of processing devices. In other words, also mobile devices having a considerable computing capacity can be referred to as a computing device, even though they provide less processing power resources than standard desktop computers. Furthermore, the term “computer” may also refer to a distributed computing device which may involve or make use of computing capacity provided in a cloud environment.
Preferred embodiments of the invention are defined in the dependent claims. It should be understood that the claimed method and the claimed computer program can have similar preferred embodiments as the claimed device and as defined in the dependent device claims.