US 12,170,081 B1
Speech brain-computer interface neural decoding systems based on Chinese language and implementation methods thereof
Guangjian Ni, Tianjin (CN); Ran Zhao, Tianjin (CN); Yanru Bai, Tianjin (CN); Hongxing Liu, Tianjin (CN); Mingkun Guo, Tianjin (CN); Qi Zheng, Tianjin (CN); and Qi Tang, Tianjin (CN)
Assigned to TIANJIN UNIVERSITY, Tianjin (CN)
Filed by TIANJIN UNIVERSITY, Tianjin (CN)
Filed on Jun. 25, 2024, as Appl. No. 18/754,146.
Claims priority of application No. 202311395030.2 (CN), filed on Oct. 26, 2023.
Int. Cl. G10L 15/18 (2013.01); G10L 13/027 (2013.01); G10L 13/047 (2013.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/24 (2013.01); G10L 25/18 (2013.01)
CPC G10L 15/1815 (2013.01) [G10L 13/027 (2013.01); G10L 13/047 (2013.01); G10L 15/063 (2013.01); G10L 15/16 (2013.01); G10L 15/24 (2013.01); G10L 25/18 (2013.01)] 9 Claims
OG exemplary drawing
 
1. A brain-computer interface (BCI) control system, comprising:
an electroencephalography (EEG) data acquisition module configured to collect EEG data during speech imagery;
a significance feature screening and verification module configured to perform feature extraction on the EEG data, verify separability of the extracted features, and screen the features to obtain EEG data with a specific frequency band or EEG data within a brain region;
a speech imagery EEG data decoding module configured to obtain, by inputting the EEG data with the specific frequency band or the EEG data within the brain region into a speech imagery semantic decoder for decoding and reconstructing, speech spectrum information, wherein the speech imagery semantic decoder includes a spatial attention layer, a convolutional layer, a subject layer, a convolutional block, a BatchNorm layer, a GELU activation layer, and two 1×1 convolutional layers; and
an understandable speech synthesis module configured to synthesize the speech spectrum information into real speech using a speech synthesis technology;
wherein a decoding process of the speech imagery semantic decoder includes:
remapping the EEG data with the specific frequency band or the EEG data within the brain region onto the spatial attention layer;
outputting, via Fourier spatial parameterization of each output channel of the spatial attention layer, the EEG data with the specific frequency band or the EEG data within the brain region to the convolutional layer for convolution, while simultaneously reducing dimensionality of the EEG data;
aligning EEG signals in a common space through training with the subject layer;
performing convolution by using the convolutional block;
halving the number of channels using the BatchNorm layer and the GELU activation layer; and
accounting for inter-subject variability by applying the two 1×1 convolutional layers, outputting a matching speech representation, and using the wav2vec 2.0 speech algorithm for automatic learning to obtain the speech spectrum information.
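The decoder recited in claim 1 can be read as a feed-forward convolutional pipeline: spatial attention over sensors, a dimensionality-reducing convolution, a subject layer, a convolutional block, a channel-halving stage with BatchNorm and GELU, and two 1×1 convolutions whose output is matched to a wav2vec 2.0 speech representation. The sketch below is a minimal, hypothetical PyTorch rendering of that pipeline and not the patented implementation: the layer widths, sensor count, the simplified spatial-attention weighting (the claimed Fourier spatial parameterization is omitted), the per-subject 1×1 convolution used as the subject layer, and the 768-dimensional output assumed to match a wav2vec 2.0 base representation are all assumptions. Because BatchNorm and GELU do not by themselves change channel counts, the sketch performs the claimed channel halving with a 1×1 convolution placed before the BatchNorm layer and GELU activation.

```python
# Minimal illustrative sketch of the claimed decoding pipeline (assumptions,
# not the patented implementation).
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Remaps EEG sensor channels onto a fixed number of output channels.

    The attention weights over input sensors are learned directly here; the
    claim's Fourier spatial parameterization of each output channel is
    abstracted away for brevity.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(out_channels, in_channels) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_channels, time) -> (batch, out_channels, time)
        attn = torch.softmax(self.weights, dim=-1)
        return torch.einsum("oc,bct->bot", attn, x)


class SubjectLayer(nn.Module):
    """Per-subject 1x1 convolution that aligns EEG signals in a common space."""

    def __init__(self, n_subjects: int, channels: int):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv1d(channels, channels, kernel_size=1) for _ in range(n_subjects)]
        )

    def forward(self, x: torch.Tensor, subject_id: int) -> torch.Tensor:
        return self.convs[subject_id](x)


class SpeechImageryDecoder(nn.Module):
    def __init__(self, n_sensors=64, n_subjects=10, hidden=256, speech_dim=768):
        super().__init__()
        self.spatial_attention = SpatialAttention(n_sensors, hidden)
        # Convolution that also reduces temporal dimensionality (stride 2).
        self.reduce = nn.Conv1d(hidden, hidden, kernel_size=3, stride=2, padding=1)
        self.subject_layer = SubjectLayer(n_subjects, hidden)
        self.conv_block = nn.Sequential(
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
        )
        # Channel-halving stage with BatchNorm and GELU; the 1x1 convolution
        # does the halving, since normalization/activation keep channel counts.
        self.halve = nn.Sequential(
            nn.Conv1d(hidden, hidden // 2, kernel_size=1),
            nn.BatchNorm1d(hidden // 2),
            nn.GELU(),
        )
        # Two 1x1 convolutions producing a representation whose dimension is
        # assumed to match wav2vec 2.0 base features (768).
        self.head = nn.Sequential(
            nn.Conv1d(hidden // 2, hidden // 2, kernel_size=1),
            nn.Conv1d(hidden // 2, speech_dim, kernel_size=1),
        )

    def forward(self, eeg: torch.Tensor, subject_id: int) -> torch.Tensor:
        # eeg: (batch, n_sensors, time)
        x = self.spatial_attention(eeg)
        x = self.reduce(x)
        x = self.subject_layer(x, subject_id)
        x = self.conv_block(x)
        x = self.halve(x)
        return self.head(x)  # (batch, speech_dim, time // 2)


if __name__ == "__main__":
    model = SpeechImageryDecoder()
    eeg = torch.randn(2, 64, 400)           # two trials, 64 sensors, 400 samples
    speech_repr = model(eeg, subject_id=3)  # (2, 768, 200)
    print(speech_repr.shape)
```

In a training setup one would presumably regress or contrastively match the decoder output against wav2vec 2.0 features extracted from spoken audio, and the reconstructed speech spectrum information would then be passed to the understandable speech synthesis module; those steps are outside the scope of this sketch.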