WO2011001693A1 - Speech intelligibility evaluation system, method, and program therefor - Google Patents
- Publication number
- WO2011001693A1 (PCT/JP2010/004358)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- speech
- sound
- user
- event
- unit
- Prior art date
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
- A61B5/377—Electroencephalography [EEG] using evoked responses
- A61B5/38—Acoustic or auditory stimuli
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/12—Audiometering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2225/00—Details of deaf aids covered by H04R25/00, not provided for in any of its subgroups
- H04R2225/43—Signal processing in hearing aids to enhance the speech intelligibility
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R25/00—Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
- H04R25/70—Adaptation of deaf aid to hearing loss, e.g. initial electronic fitting
Definitions
- The present invention relates to a technique for evaluating whether or not a speech sound has been heard. More specifically, it relates to a speech intelligibility evaluation system for evaluating the degree of "fitting" of a hearing aid or the like that adjusts the amount of amplification for each sound frequency so as to obtain sound of an appropriate level for each user.
- A hearing aid is a device that amplifies sounds at frequencies the user has difficulty hearing, in order to compensate for the user's lowered hearing ability.
- the amount of sound amplification that the user seeks from the hearing aid varies depending on the degree of hearing loss for each user. Therefore, before starting to use the hearing aid, “fitting” that first adjusts the amount of sound amplification in accordance with the hearing ability of each user is essential.
- The fitting is performed with the aim of setting the hearing aid's output sound pressure for each frequency to the MCL (most comfortable level: the sound pressure level the user finds comfortable). If the fitting is not appropriate, the sound may, for example, be insufficiently amplified and hard to hear, or over-amplified so that the user finds it noisy.
- the fitting is generally performed based on an audiogram for each user.
- The "audiogram" is the result of evaluating the "hearing" of pure tones: for example, a chart in which, for each of several frequencies (e.g., 250 Hz, 500 Hz, 1000 Hz, 2000 Hz, and 4000 Hz), the smallest sound pressure level (decibel value) the user can hear is plotted against frequency.
- The "speech intelligibility evaluation" is an evaluation of whether speech sounds are actually distinguished: specifically, an evaluation of whether single-syllable speech sounds are heard correctly.
- A single-syllable speech sound is one vowel or a combination of a consonant and a vowel (for example, "A" / "DA" / "SH"). Since the purpose of wearing a hearing aid is to distinguish speech in conversation, the evaluation result of speech intelligibility must be given weight.
- Conventional speech intelligibility evaluation was performed by the following procedure. First, single-syllable speech sounds from the 57S word table (50 single syllables) or the 67S word table (20 single syllables) established by the Japan Audiological Society are presented to the user one at a time, either verbally or by CD playback. Next, the user is asked to answer which speech sound was heard, by speaking or writing. Then the evaluator collates the answers against the word table and calculates the correct-answer rate.
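The conventional scoring step can be sketched as follows (an illustrative sketch only: the syllable lists here are stand-ins, not the actual 57S/67S word tables):

```python
def intelligibility_score(presented, answered):
    """Percentage of single-syllable speech sounds the user answered correctly."""
    if len(presented) != len(answered):
        raise ValueError("one answer is expected per presented sound")
    correct = sum(p == a for p, a in zip(presented, answered))
    return 100.0 * correct / len(presented)

# Example: one mishearing ("ra" heard as "ya") out of six presentations.
presented = ["na", "ma", "ra", "ya", "ka", "ta"]
answered = ["na", "ma", "ya", "ya", "ka", "ta"]
print(intelligibility_score(presented, answered))  # prints 83.33333333333333
```

This is exactly the collation the evaluator performs by hand; Patent Document 1, discussed next, automates it on a PC.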
- Patent Document 1 discloses a speech intelligibility evaluation method that automatically performs correct / incorrect determination using a personal computer (PC) in order to reduce the burden on the evaluator.
- A single-syllable voice is presented to the user using a PC, the user answers by mouse or pen touch, the answer is accepted as input to the PC, and the PC automatically determines correctness by collating the presented voice with the answer input.
- Because the answer is received by mouse or pen-touch input, the evaluator does not need to decipher and identify the user's spoken or written answers, and the evaluator's effort is greatly reduced.
- Patent Document 2 discloses a speech intelligibility evaluation method that presents a selection candidate of speech corresponding to a speech after the speech is presented in order to reduce the burden of the user's answer input.
- The number of selection candidates is narrowed down to a few, so the user's effort in searching for a character is reduced to selecting the corresponding speech sound from among several characters.
- In both methods, the answer input is received using a PC, which reduces the burden on the evaluator.
- An object of the present invention is to realize a speech intelligibility evaluation system that does not require troublesome answer input from the user.
- The speech intelligibility evaluation system according to the present invention includes: a biological signal measurement unit that measures the user's electroencephalogram (EEG) signal; a presented-speech control unit that determines the speech sound to be presented, by referring to a speech sound database storing a plurality of single-syllable speech sounds; an audio output unit that presents the speech sound determined by the presented-speech control unit as a voice; a feature component detection unit that determines, from the EEG signal measured by the biological signal measurement unit, the presence or absence of a feature component of the event-related potential at 800 ms ± 100 ms from the time the voice was presented; and a speech intelligibility evaluation unit that determines whether the user heard the speech sound based on the determination result of the feature component detection unit.
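The interaction of these units can be sketched as follows. All names are illustrative rather than taken from the patent; the EEG measurement and audio output are stubbed out, and only the decision logic mirrors the text (the feature component appears when discrimination confidence is low, so its presence means the sound was likely not heard):

```python
import random

speech_db = ["na", "ma", "ra", "ya", "ka", "ta"]  # single-syllable speech sounds

def evaluate(sound, feature_present):
    # Speech intelligibility evaluation unit: presence of the feature
    # component in the 800 ms +/- 100 ms window means "not heard".
    return {"sound": sound, "heard": not feature_present}

def run_session(detect_feature, trials=6):
    """detect_feature(sound) stands in for the biological signal measurement
    and feature component detection units."""
    results = []
    for _ in range(trials):
        sound = random.choice(speech_db)  # presented-speech control unit
        # ... the audio output unit would play `sound` here ...
        results.append(evaluate(sound, detect_feature(sound)))
    return results
```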
- The event-related potential may be acquired at electrode position Pz of the International 10-20 system. When the feature component detection unit determines that a component of the event-related potential equal to or greater than a predetermined value is present, the speech intelligibility evaluation unit may determine that the user could not hear the speech sound.
- When the feature component detection unit determines that the feature component is absent from the event-related potential, the speech intelligibility evaluation unit may determine that the user heard the speech sound; when it determines that the feature component is present, it may determine that the user did not hear the speech sound.
- The speech sound database may store, for each of a plurality of speech sounds, grouping information, consonant information, and the probability of occurrence of abnormal hearing (mishearing).
- the speech intelligibility evaluation unit may evaluate the speech intelligibility for each speech, for each consonant, or for each group related to the probability of occurrence of abnormal hearing.
- the speech database stores a plurality of speech sets whose frequency gains are adjusted by a plurality of fitting methods, and the speech intelligibility evaluation system switches the speech sets stored in the speech database regularly or randomly. It may further include a fitting technique switching unit that selects one of the plurality of fitting techniques.
- The speech intelligibility evaluation unit determines, for each fitting method, whether the speech sounds were heard, and the fitting method with the highest probability of the speech sounds being determined as heard may be judged suitable for the user.
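The fitting-method comparison just described could be sketched as follows (a hypothetical helper, not the patent's implementation): tally the heard-judgements produced under each fitting method's speech set and pick the method with the highest heard-probability.

```python
def best_fitting_method(results):
    """results maps a fitting-method name to a list of per-presentation
    heard-judgements (True = judged heard, i.e. no low-confidence component)."""
    probs = {m: sum(r) / len(r) for m, r in results.items()}
    return max(probs, key=probs.get), probs

# Illustrative data for three fitting methods A-C.
method, probs = best_fitting_method({
    "A": [True, True, False, True],
    "B": [True, False, False, False],
    "C": [True, True, True, True],
})
print(method)  # prints C
```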
- Another speech intelligibility evaluation system according to the present invention includes: a presented-speech control unit that determines the speech sound to be presented, by referring to a speech sound database storing a plurality of single-syllable speech sounds; an audio output unit that presents the speech sound determined by the presented-speech control unit as a voice; a feature component detection unit that determines, from the user's EEG signal measured by a biological signal measurement unit, the presence or absence of a feature component of the event-related potential at 800 ms ± 100 ms from the time the voice was presented; and a speech intelligibility evaluation unit that determines whether the user heard the speech sound based on the determination result of the feature component detection unit.
- The speech intelligibility evaluation method according to the present invention includes: a step of measuring the user's EEG signal; a step of determining the speech sound to be presented, by referring to a speech sound database holding a plurality of single-syllable speech sounds; a step of presenting the determined speech sound as a voice; a step of determining the presence or absence of a feature component of the event-related potential at 800 ms ± 100 ms from the voice presentation; and a step of determining whether the user heard the speech sound based on the determination result.
- The computer program according to the present invention is executed by a computer and causes the computer to execute: a step of receiving the user's measured EEG signal; a step of determining the speech sound to be presented, by referring to a speech sound database holding a plurality of single-syllable speech sounds; a step of presenting the determined speech sound as a voice; a step of determining the presence or absence of a feature component of the event-related potential at 800 ms ± 100 ms from the voice presentation; and a step of determining whether the user heard the speech sound based on the determination result.
- According to the present invention, whether the user could hear the presented speech sound can be evaluated quantitatively and automatically from the presence or absence of a characteristic EEG component at the center of the user's head after the voice presentation. This eliminates the need for troublesome answer input by the user, and makes it possible to evaluate listening results with less burden on both the evaluator and the user.
- The drawings show: the configuration and usage environment of the speech intelligibility evaluation system 100 according to Embodiment 1; the hardware configuration of the speech intelligibility evaluation apparatus 1; the functional block configuration of the speech intelligibility evaluation system 100; an example of the speech sound DB 71; a flowchart of the processing performed in the speech intelligibility evaluation system 100; an example of a speech intelligibility evaluation result for English; the external appearance of headphones corresponding to the audio output unit 11; the functional block configuration of the speech intelligibility evaluation system 200 according to Embodiment 2; an example of the speech sound DB 72 according to Embodiment 2; an example in which speech intelligibility was evaluated for each speech sound under each of fitting methods A to C; an example of a fitting-method evaluation result; and a flowchart of the processing sequence of the speech intelligibility evaluation system 200.
- The speech intelligibility evaluation system according to the present invention is used to evaluate speech intelligibility using brain waves. More specifically, it presents a single-syllable speech sound as a voice, has the user distinguish the voice, and evaluates the user's discrimination of the speech sound using the event-related potential of the user's EEG signal, time-locked to the voice presentation.
- “presenting a voice” means outputting an auditory stimulus, for example, outputting a voice from a speaker.
- the kind of speaker is arbitrary. It may be a speaker installed on the floor or stand, or a headphone speaker.
- the inventors of the present application conducted the following two types of experiments in order to realize speech intelligibility evaluation that does not require the user's answer input.
- The inventors of the present application first conducted a behavioral experiment to examine the relationship between the confidence of voice discrimination and the probability of occurrence of abnormal hearing. Specifically, single-syllable speech was presented as a voice followed by a character (hiragana), the user was asked whether the voice and the character were the same, and the confidence of hearing the voice was answered by button press. As a result, the inventors confirmed that when the confidence of voice discrimination was high, the probability of abnormal hearing was as low as 10% or less, and that when the confidence was low, the probability of abnormal hearing was high.
- Next, the inventors of the present application conducted an experiment measuring the event-related potential from voice presentation, in a setting where a single-syllable voice was presented and the user was asked to think of the corresponding speech sound. The event-related potentials were then averaged according to the discrimination confidence obtained in the behavioral experiment.
- They discovered that, in the event-related potential time-locked to the voice stimulus, a positive component is induced at a latency of 700 ms to 900 ms around the top of the head when the confidence of voice discrimination is low, compared to when it is high.
- An event-related potential is a part of an electroencephalogram, and is a transient potential fluctuation in the brain that occurs temporally in relation to an external or internal event. Here, it means a potential fluctuation related to the presented voice.
- “latency” indicates the time until the peak of the positive component or negative component appears from the time when the voice stimulus is presented.
- That is, speech intelligibility can be evaluated from the confidence of speech discrimination, which can be determined from the presence or absence of the positive component at a latency of 700 ms to 900 ms in the event-related potential time-locked to the voice presentation.
- Conventional speech intelligibility evaluation was based only on whether the user's answer was correct. By this method, speech intelligibility is evaluated based on whether the user felt able to hear the voice, rather than on whether the voice was actually heard correctly, realizing an evaluation that reflects the user's subjective sense of hearing.
- the experiment participants were 6 university / graduate students with normal hearing.
- Fig. 1 shows the outline of the experimental procedure for behavioral experiments.
- Stimulus speech sounds were selected from pairs of the Na/Ma rows, Ra/Ya rows, and Ka/Ta rows, with reference to "Hearing aid fitting concept" (Kazuoki Kodera, Shindan To Chiryosha, 1999, p. 172). The participants were instructed to listen to the voice and think of the corresponding hiragana. Because the participants had normal hearing, voices under three frequency-gain conditions were presented so that the discrimination confidence for each voice would vary: (1) 0 dB condition: the frequency gain was not processed (an easily audible voice); (2) −25 dB condition and (3) −50 dB condition: the high-frequency gain was reduced (see FIG. 2).
- FIG. 2 shows the gain adjustment amount for each frequency in each of the conditions (1) to (3).
- The reason for reducing the high-frequency gain is to reproduce a typical pattern of hearing loss in elderly people. In general, elderly people with hearing loss often have difficulty hearing high-frequency sounds. By reducing the high-frequency gain, a normal-hearing person can experience simulated hearing equivalent to the difficulty experienced by an elderly hearing-impaired person.
- Procedure B is a button press for proceeding to Procedure C, and was added in order to present the text stimulus of Procedure C at the participant's pace in the experiment. This button is also referred to as the “Next” button.
- In Procedure C, a single hiragana character was presented on the display. A character matching the voice presented in Procedure A (matching trial) or a hiragana character not matching the voice (mismatching trial) was shown, each with probability 0.5. The mismatching hiragana was taken from the paired rows Na/Ma, Ra/Ya, and Ka/Ta: for example, when "NA" was presented as the voice in Procedure A, "NA" was presented in Procedure C in a matching trial and "MA" in a mismatching trial.
- Procedure D is a button press (numbers 1 to 5 on the keyboard) for confirming how well the participant felt the voice presented in Procedure A and the character presented in Procedure C matched: 5 for an absolute match, 4 for a probable match, 3 for unsure, 2 for a probable mismatch, and 1 for an absolute mismatch.
- When 5 or 1 was pressed, the trials divide into correct answers and incorrect answers (occurrences of abnormal hearing) at the stage of Procedure C, but in either case the participant can be said to have been confident in discriminating the voice presented in Procedure A. Similarly, when 2 to 4 were pressed, the participant can be said to have lacked confidence in hearing the voice.
- FIG. 3 is a flowchart showing the procedure for one trial. In this flowchart, for the convenience of explanation, both the operation of the apparatus and the operation of the experiment participant are described.
- Step S11 is a step of presenting a single syllable voice to the experiment participant.
- Step S12 is a step in which a participant hears a single syllable voice and thinks of a corresponding hiragana.
- Hiragana is a character (phonetic character) representing pronunciation in Japanese. In the case of English or Chinese as described later, for example, a character string or phonetic symbol of a single syllable word corresponds to a hiragana character.
- Step S13 is a step in which the participant presses the space key as the next button (procedure B).
- Step S14 is a step in which Hiragana characters that match or do not match the voice are presented on the display with a probability of 50% starting from Step S13 (Procedure C).
- Step S15 is a step of confirming whether the hiragana conceived by the participant in step S12 matches the hiragana presented in step S14.
- Step S16 is a step in which the number of keys 1 to 5 is used to answer how much the participant feels a match / mismatch in step S15 (procedure D).
- FIG. 4 is a diagram showing the degree of confidence in the voice recognition of the participants classified according to the result of the button press and the probability of correct / incorrect button press.
- the confidence level of the discrimination was classified as follows. When 5 (absolute coincidence) or 1 (absolute disagreement) was pressed, the confidence level of discrimination was “high”. The probability that the degree of confidence was “high” was 60.4% of all trials (522 trials out of 864 trials). When 4 (probably coincident), 3 (not sure), or 2 (probably inconsistent) was pressed, the confidence level of discrimination was set to “low”. The probability that the degree of confidence was “low” was 39.6% of the total trials (342 trials out of 864 trials).
- The correctness of the button press was determined from the match/mismatch between the voice and the character and the button pressed: the answer was judged correct if 5 (absolute match) or 4 (probable match) was pressed in a matching trial, or if 1 (absolute mismatch) or 2 (probable mismatch) was pressed in a mismatching trial.
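The two scoring rules above (confidence from the button number, correctness from the button versus the trial type) amount to:

```python
def confidence(button):
    # 5 (absolute match) or 1 (absolute mismatch) -> confident discrimination.
    return "high" if button in (5, 1) else "low"

def is_correct(button, matching_trial):
    # Match trial: 5 or 4 counts as correct; mismatch trial: 1 or 2 counts.
    return button in (5, 4) if matching_trial else button in (1, 2)
```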
- Fig. 4 (a) shows the correct / wrong result of pressing a button in a trial with a high degree of confidence. It can be seen that the correct button was selected in almost all trials (92%). This indicates that when the confidence level of discrimination is high, the voice can be correctly recognized. From this result, it can be said that when the confidence level of the discrimination is high, the speech intelligibility is high.
- Fig. 4 (b) shows the correct / wrong result of pressing a button in a trial with low discrimination confidence. It can be seen that there is a high probability that the wrong button was pressed (42%). This indicates that abnormal hearing is likely to occur when the degree of confidence of discrimination is low. From this result, it can be said that when the confidence level of the discrimination is low, it can be evaluated that the speech intelligibility is low.
- the behavioral experiment has revealed that it is possible to achieve speech intelligibility evaluation based on the user's degree of confidence in listening to speech.
- If the degree of confidence of hearing can be measured by some method other than button pressing, speech intelligibility evaluation without answer input can be realized based on that index.
- the inventors of the present application paid attention to the event-related potential of the electroencephalogram, and conducted an electroencephalogram measurement experiment to examine whether or not there is a component that reflects the difference in the degree of confidence of discrimination for speech.
- Next, the electroencephalogram measurement experiment will be described.
- Electroencephalogram measurement experiment: the inventors of the present application conducted an electroencephalogram measurement experiment to examine the relationship between the confidence of voice discrimination and the event-related potential after voice presentation.
- With reference to FIGS. 5 to 9, the experimental settings and results of the electroencephalogram measurement experiment will be described.
- the experiment participants were 6 university / graduate students who were the same as those in the behavioral experiment.
- FIG. 5 is a diagram showing electrode positions in the international 10-20 method.
- the sampling frequency was 200 Hz and the time constant was 1 second.
- a 1-6 Hz digital bandpass filter was applied off-line.
- As the event-related potential for each voice presentation, a waveform from −100 ms to 1000 ms relative to the voice onset was cut out.
- the average of event-related potentials was calculated based on the degree of confidence of hearing for each participant and for each speech sound under all the conditions (0 dB ⁇ ⁇ 25 dB ⁇ ⁇ 50 dB) in the above behavioral experiment.
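The preprocessing described above (200 Hz sampling, an off-line 1-6 Hz bandpass, −100 ms to 1000 ms epochs, averaging) might look like the following numpy sketch. The crude FFT-based filter is a stand-in for whatever digital bandpass was actually used, and the baseline correction anticipates the convention stated later (mean of −100 ms to 0 ms set to zero):

```python
import numpy as np

FS = 200               # sampling frequency in Hz, as in the experiment
PRE, POST = 0.1, 1.0   # epoch window: -100 ms to 1000 ms around voice onset

def bandpass_1_6hz(x):
    """Crude FFT-based 1-6 Hz bandpass (illustrative stand-in)."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), d=1.0 / FS)
    X[(f < 1.0) | (f > 6.0)] = 0.0
    return np.fft.irfft(X, n=len(x))

def epoch(eeg, onset_idx):
    """Cut a -100..1000 ms epoch and zero the -100..0 ms baseline."""
    n_pre = int(PRE * FS)
    seg = eeg[onset_idx - n_pre: onset_idx + int(POST * FS)].copy()
    seg -= seg[:n_pre].mean()  # baseline correction
    return seg

def averaged_erp(eeg, onsets):
    """Event-related potential averaged over the given voice onsets."""
    filtered = bandpass_1_6hz(eeg)
    return np.mean([epoch(filtered, i) for i in onsets], axis=0)
```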
- FIG. 6 shows an outline of the experimental procedure of the electroencephalogram measurement experiment.
- In Procedure X, a single-syllable speech sound was presented. As in the behavioral experiment, stimulus speech sounds were selected from pairs of the Na/Ma rows, Ra/Ya rows, and Ka/Ta rows, with reference to "Hearing aid fitting concept" (Kazuoki Kodera, Shindan To Chiryosha, 1999, p. 172). The participants were instructed to listen to the voice and think of the corresponding hiragana. Also as in the behavioral experiment, voices under the following three conditions were presented so that the discrimination confidence of the normal-hearing participants would vary: (1) 0 dB condition: the frequency gain was not processed (an easily audible voice); (2) −25 dB condition and (3) −50 dB condition: the high-frequency gain was reduced (see FIG. 2).
- FIG. 7 is a flowchart showing the procedure for one trial.
- the same blocks as those in FIG. 3 are denoted by the same reference numerals, and the description thereof is omitted.
- the difference from FIG. 3 is that there is no step S13 to step S16, and the experiment participant is not required to perform an explicit action.
- FIG. 8 shows waveforms obtained by averaging the event-related potentials at Pz, time-locked to voice presentation, separately by discrimination confidence. The averaging was performed across all conditions of the behavioral experiment (0 dB, −25 dB, −50 dB), for each participant and each speech sound, according to discrimination confidence.
- the horizontal axis is time and the unit is ms
- the vertical axis is potential and the unit is ⁇ V.
- the downward direction of the graph corresponds to positive and the upward direction to negative.
- the baseline was aligned so that the average potential from -100 ms to 0 ms would be zero.
- The solid line in FIG. 8 is the averaged event-related potential waveform at electrode position Pz for trials in which discrimination confidence was high in the behavioral experiment, and the broken line is the waveform for trials in which it was low. FIG. 8 shows that, compared to the solid line (high confidence), a positive component appears at a latency of 700 ms to 900 ms in the broken line (low confidence).
- the average potential between 700 ms and 900 ms for each participant was ⁇ 0.47 ⁇ V when the confidence level of discrimination was high and 0.13 ⁇ V when the confidence level was low.
- the section average potential was significantly larger when the discrimination confidence was low (p < 0.05).
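The section average potential being compared here can be computed as below; the epoch is assumed to start at −100 ms, so a latency t maps to sample round((t + 0.1) * FS):

```python
import numpy as np

FS = 200  # Hz, as in the experiment

def section_average(erp, t0=0.7, t1=0.9, pre=0.1):
    """Mean potential over the 700-900 ms latency window of an epoch
    that begins `pre` seconds before voice onset."""
    i0, i1 = round((t0 + pre) * FS), round((t1 + pre) * FS)
    return float(np.mean(erp[i0:i1]))
```

With this index, the −0.47 μV (high confidence) versus 0.13 μV (low confidence) comparison reported above is a single per-participant number per condition.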
- From these results, the inventors of the present application concluded that the event-related potential at a latency of 700 ms to 900 ms, time-locked to the voice presentation, reflects the discrimination confidence, and that this potential can be used as an index of discrimination confidence.
- The time ranges in which the significant difference due to discrimination confidence persisted for 30 ms or longer were only 730 ms to 770 ms and 840 ms to 915 ms.
- FIG. 9 is a diagram showing, for each confidence level, the segment average potential from 700 ms to 900 ms of the event-related potential starting from voice presentation at the electrode positions C3, Cz, C4.
- the black circle line shown in FIG. 9 indicates the section average potential when the discrimination confidence level is high, and the white circle line indicates the section average potential when the discrimination confidence level is low.
- At electrode positions C3, Cz, and C4, the event-related potential is positive when discrimination confidence is high and negative when it is low. Focusing on polarity, the polarity is inverted between the measurement at electrode position Pz (FIG. 8) and that at electrode position Cz (FIG. 9). Since the polarity of the general P300 component hardly reverses between Cz and Pz, the positive component induced at Pz when discrimination confidence is low is likely a component different from the P300 component. Here, the "P300 component" is the positive component of the event-related potential described in "New Physiological Psychology Vol. 2" (supervised by Miyata, Kitaoji Shobo, 1997), p. 14.
- The black circle line (the section average potential when discrimination confidence is high) and the white circle line (when it is low) show significantly different potential distribution patterns (p < 0.05). Accordingly, the degree of discrimination confidence can also be determined using the potential distribution pattern at electrode positions C3, Cz, and C4. Since C3, Cz, and C4 lie where the headband of overhead headphones contacts the head, electrodes are easy to mount there when performing speech intelligibility evaluation with headphones.
- The positive component at a latency of 700 ms to 900 ms at electrode position Pz (FIG. 8) and the characteristic component at a latency of 700 ms to 900 ms at electrode positions C3, C4, and Cz (FIG. 9) can be identified by various methods: for example, by thresholding the magnitude of the peak amplitude near a latency of about 700 ms, or by creating a template from a typical waveform of the component and computing the similarity to that template.
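Both identification methods mentioned here (peak-amplitude thresholding and template similarity) can be sketched as follows; the threshold values and the use of correlation as the similarity measure are illustrative assumptions:

```python
import numpy as np

FS = 200  # Hz; epochs assumed to start at -100 ms relative to voice onset

def peak_threshold_detect(erp, threshold_uv=0.5, t0=0.7, t1=0.9, pre=0.1):
    """Method 1: threshold the peak amplitude in the 700-900 ms window."""
    i0, i1 = round((t0 + pre) * FS), round((t1 + pre) * FS)
    return float(np.max(erp[i0:i1])) >= threshold_uv

def template_detect(erp, template, similarity_threshold=0.5):
    """Method 2: correlate the epoch with a stored template of the typical
    positive component and threshold the similarity."""
    r = np.corrcoef(erp, template)[0, 1]
    return r >= similarity_threshold
```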
- The threshold value or template may be one stored in advance from typical users, or may be created for each individual.
- The waveforms shown above were obtained by averaging the data of the six participants approximately 40 times for each level of discrimination confidence.
- By devising the feature extraction method (for example, wavelet transform of the waveform) or the identification method (for example, support vector machine learning), the positive component can be identified without averaging or with only a few averagings.
- In this specification, in order to define a component of the event-related potential (ERP), the time elapsed from a certain starting point is expressed as, for example, "a latency of 700 ms to 900 ms". This means that a range centered on a specific time between 700 ms and 900 ms can be included.
- the terms “about Xms” and “near Xms” mean that a width of 30 to 50 ms can exist around the Xms (for example, 300 ms ⁇ 30 ms, 700 ms ⁇ 50 ms).
- width from 30 ms to 50 ms is an example of a general individual difference of the P300 component.
- Since the positive component at a latency of 700 ms to 900 ms has a later latency than the P300, individual differences among users appear even larger. It is therefore preferable to treat it with a wider width, for example a width of about 100 ms.
- FIG. 10 shows the correspondence between the presence / absence of a positive component, the degree of confidence of discrimination, and the ease of hearing, which are summarized by the inventors of the present application. This correspondence is created by taking the positive component at the electrode position Pz as an example.
- the speech intelligibility evaluation system sequentially presents speech of single syllables by voice, and implements speech evaluation by using presence / absence of positive components in the event-related potential latency from 700 ms to 900 ms starting from the speech presentation.
- This is a speech sound intelligibility evaluation system that is realized for the first time based on the above-mentioned two discoveries by the inventors of the present application and without a user's answer input.
- In Embodiment 1, a first embodiment of a speech intelligibility evaluation system using the positive component that reflects discrimination confidence will be described.
- First, an outline of the speech intelligibility evaluation system will be described: it sequentially presents voices, measures the event-related potential time-locked to each voice presentation, detects the characteristic component at a latency of 700 ms to 900 ms that appears when voice discrimination confidence is low, and thereby evaluates speech listening. Thereafter, the configuration and operation of the speech intelligibility evaluation system, including the speech intelligibility evaluation device, will be described.
- the exploration electrode (sometimes referred to as the measurement electrode) is installed at position Pz on the top of the head, the reference electrode is installed on either the left or right mastoid, and the EEG is measured as the potential difference between the exploration electrode and the reference electrode.
- the reference electrode may be an earlobe as long as it is in the vicinity of the ear, or may be a portion that contacts an ear pad of headphones or glasses.
- the level and polarity of the characteristic component of the event-related potential vary depending on the part where the electrode for electroencephalogram measurement is attached and how to set the reference electrode and the exploration electrode.
- those skilled in the art can make appropriate modifications according to the setting of the reference electrode and the exploration electrode at that time, detect the characteristic component of the event-related potential, and evaluate speech intelligibility. Such modifications are within the scope of the present invention.
- FIG. 11 shows the configuration and usage environment of the speech intelligibility evaluation system 100 according to this embodiment. This speech intelligibility evaluation system 100 is illustrated corresponding to the system configuration of Embodiment 1 described later.
- the speech intelligibility evaluation system 100 includes a speech intelligibility evaluation device 1, an audio output unit 11, and a biological signal measurement unit 50.
- the biological signal measuring unit 50 is connected to at least two electrodes A and B.
- the electrode A is attached to the mastoid of the user 5, and the electrode B is attached to a position at the top of the head of the user 5 (the so-called Pz position).
- the speech intelligibility evaluation system 100 presents a single-syllable word sound to the user 5 by voice, and judges whether a positive component of latency 700 ms to 900 ms is present in the brain wave (event-related potential) of the user 5 measured with the voice presentation time as the starting point.
- “latency 700 ms to 900 ms” means a latency from 700 ms to 900 ms, including the boundaries at 700 ms and 900 ms. Then, based on the presented speech and the presence or absence of the positive component, speech intelligibility evaluation is realized automatically, without an answer input from the user 5.
- the brain wave of the user 5 is acquired by the biological signal measuring unit 50 based on the potential difference between the electrode A and the electrode B.
- the biological signal measurement unit 50 transmits information corresponding to the potential difference to the speech intelligibility evaluation device 1 wirelessly or by wire.
- FIG. 11 illustrates an example in which the biological signal measurement unit 50 transmits the information to the speech intelligibility evaluation device 1 wirelessly.
- the speech intelligibility evaluation device 1 performs sound pressure control of the speech used for speech intelligibility evaluation and control of the voice presentation timing, and presents the voice to the user 5 via the audio output unit 11 (for example, a speaker).
- FIG. 12 shows a hardware configuration of the speech intelligibility evaluation apparatus 1 according to this embodiment.
- the speech sound intelligibility evaluation apparatus 1 includes a CPU 30, a memory 31, and an audio controller 32. These are connected to each other by a bus 34 and can exchange data with each other.
- the CPU 30 executes a computer program 35 stored in the memory 31.
- the computer program 35 describes a processing procedure shown in a flowchart described later.
- the speech intelligibility evaluation apparatus 1 performs processing for controlling the entire speech intelligibility evaluation system 100 using the speech database (DB) 71 stored in the same memory 31 in accordance with the computer program 35. This process will be described in detail later.
- DB: speech database
- the audio controller 32 generates the voice to be presented according to a command from the CPU 30 and outputs the generated audio signal to the audio output unit 11.
- the speech intelligibility evaluation device 1 may be realized as hardware such as a DSP in which a computer program is incorporated in one semiconductor circuit.
- a DSP can realize all the functions of the CPU 30, the memory 31, and the audio controller 32 described above with a single integrated circuit.
- the computer program 35 described above can be recorded on a recording medium such as a CD-ROM and distributed as a product to the market, or can be transmitted through an electric communication line such as the Internet.
- the apparatus (for example, a PC) provided with the hardware shown in FIG. 12 can function as the speech intelligibility evaluation apparatus 1 according to the present embodiment by reading the computer program 35.
- the speech sound DB 71 may not be held in the memory 31 and may be stored in, for example, a hard disk (not shown) connected to the bus 34.
- FIG. 13 shows a functional block configuration of the speech intelligibility evaluation system 100 according to the present embodiment.
- the speech intelligibility evaluation system 100 includes an audio output unit 11, a biological signal measurement unit 50, and a speech intelligibility evaluation device 1.
- FIG. 13 also shows detailed functional blocks of the speech intelligibility evaluation apparatus 1. That is, the speech sound intelligibility evaluation apparatus 1 includes a positive component detection unit 60, a presented speech sound control unit 70, a speech sound DB 71, and a speech sound intelligibility evaluation unit 80. Note that the block of the user 5 is shown for convenience of explanation.
- each functional block (except the speech sound DB 71) of the speech sound intelligibility evaluation apparatus 1 corresponds to a function realized as a whole by the CPU 30, the memory 31, and the audio controller 32 executing the program described with reference to FIG. 12.
- the speech sound DB 71 is a speech sound database for evaluating speech intelligibility.
- FIG. 14 shows an example of the speech sound DB 71.
- in the speech sound DB 71, a voice file to be presented, a consonant label, and grouping data according to the likelihood of occurrence of abnormal hearing (how easily mishearing occurs) are associated with each other.
- the type of the word sound to be stored may be the word sound listed in the 57S word table or 67S word table.
- the consonant label is used when evaluating, for the user 5, for which consonants the probability of occurrence of abnormal hearing is high.
- the grouping data is used when evaluating, for the user 5, in which group the probability of occurrence of abnormal hearing is high.
- the grouping is, for example, major classification, middle classification, or minor classification.
- the major classifications are vowels, unvoiced consonants, and voiced consonants.
- the middle classification is a classification within the unvoiced consonants and within the voiced consonants.
- within the unvoiced consonants, the Sa line (middle classification: 1) is distinguished from the Ta/Ka/Ha lines (middle classification: 2).
- within the voiced consonants, the Ra/Ya/Wa lines (middle classification: 1) are distinguished from the Na/Ma/Za/Ga/Da/Ba lines (middle classification: 2).
- the minor classification further divides the latter into the Na/Ma lines (minor classification: 1) and the Za/Ga/Da/Ba lines (minor classification: 2).
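The grouping above can be illustrated with a small sketch; the table contents, labels, and helper names are illustrative only, not taken from the actual speech sound DB 71:

```python
# Hypothetical sketch of the grouping columns of the speech sound DB (FIG. 14).
# major: 0/1/2 = vowel / unvoiced consonant / voiced consonant;
# middle and minor subdivide those classes as described above.
SPEECH_DB = {
    # sound: (consonant label, major, middle, minor)
    "a":  (None, 0, None, None),   # vowel
    "sa": ("s", 1, 1, None),       # unvoiced, Sa line
    "ka": ("k", 1, 2, None),       # unvoiced, Ta/Ka/Ha line
    "ya": ("y", 2, 1, None),       # voiced, Ra/Ya/Wa line
    "na": ("n", 2, 2, 1),          # voiced, Na/Ma line
    "ga": ("g", 2, 2, 2),          # voiced, Za/Ga/Da/Ba line
}

def sounds_in_group(major=None, middle=None, minor=None):
    """Return the word sounds whose grouping matches the given classification."""
    out = []
    for sound, (_, maj, mid, mino) in SPEECH_DB.items():
        if major is not None and maj != major:
            continue
        if middle is not None and mid != middle:
            continue
        if minor is not None and mino != minor:
            continue
        out.append(sound)
    return out

print(sounds_in_group(major=2, middle=2))  # → ['na', 'ga']
```

Such a lookup is what would let the presented speech sound control unit 70 select word sounds of a specific group, as described below.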
- the presented speech sound control unit 70 refers to the speech sound DB 71 and determines the speech sound to be presented.
- the presented speech sound control unit 70 may, for example, select and determine the speech sounds in random order, or may receive information on unevaluated or to-be-re-evaluated speech sounds from the speech intelligibility evaluation unit 80 and determine the speech accordingly.
- the presented speech sound control unit 70 may select a specific consonant or speech of a speech sound group in order to obtain information on which consonant or in which speech sound group the occurrence probability of abnormal hearing is high.
- the presented speech sound control unit 70 controls the speech output unit 11 to present the speech determined in this way to the user 5 by speech.
- the presented speech sound control unit 70 also transmits a trigger and the content of the presented voice to the positive component detection unit 60 in accordance with the voice presentation time.
- the voice output unit 11 reproduces a single syllable voice designated by the presentation word sound control unit 70 and presents it to the user 5.
- the biological signal measuring unit 50 is an electroencephalograph that measures a biological signal of the user 5 and measures an electroencephalogram as a biological signal. It is assumed that the user 5 is wearing an electroencephalograph in advance.
- the electroencephalogram measurement electrode is attached to, for example, Pz at the top of the head.
- the positive component detection unit 60 receives the brain wave of the user 5 measured by the biological signal measurement unit 50, and cuts out the event-related potential in a predetermined section (for example, the section from −100 ms to 1000 ms) from the received brain wave, with the trigger received from the presented word sound control unit 70 as the starting point.
- the positive component detection unit 60 performs addition averaging of the extracted event-related potentials according to the content of the presented voice received from the presented word sound control unit 70.
- the positive component detection unit 60 may select only identical word sounds and average them, or may select word sounds having the same consonant and average those. Averaging may also be carried out for each major, middle, or minor classification of the grouping. When averaging only identical word sounds, discrimination can be evaluated for each word sound; when averaging word sounds sharing a consonant, it can be evaluated which consonants have low intelligibility.
- the major classification, middle classification, and minor classification mean the classification described above with reference to FIG.
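The addition averaging by word sound, consonant, or group can be sketched as follows; the function and variable names are hypothetical, and epochs are represented simply as lists of potential samples:

```python
# Minimal sketch of the averaging performed by the positive component detection
# unit 60: epochs cut out per trigger are grouped by the presented word sound
# (or its consonant / classification group) and averaged sample by sample.
def average_epochs(epochs, labels, key):
    """epochs: list of sample lists; labels: presented word sound per epoch;
    key: function mapping a label to its averaging group."""
    groups = {}
    for epoch, label in zip(epochs, labels):
        groups.setdefault(key(label), []).append(epoch)
    # sample-wise mean within each group
    return {
        g: [sum(samples) / len(samples) for samples in zip(*eps)]
        for g, eps in groups.items()
    }

# Two trials of "na" and one of "sa"; averaging per word sound:
epochs = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
labels = ["na", "na", "sa"]
avg = average_epochs(epochs, labels, key=lambda s: s)
print(avg["na"])  # → [2.0, 3.0]
```

Passing a different `key` (for example, one returning the consonant label) yields averaging per consonant instead of per word sound.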
- the positive component detection unit 60 identifies the event-related potential and determines the presence or absence of a positive component with a latency of 700 ms to 900 ms.
- the positive component detection unit 60 identifies the presence or absence of a positive component by, for example, the following method: it compares the maximum amplitude in the latency range 700 ms to 900 ms, or the section average potential over latency 700 ms to 900 ms, with a predetermined threshold. When the section average potential is larger than the threshold, the result is identified as “positive component present”; when smaller, as “no positive component”.
- alternatively, the positive component detection unit 60 may compute the similarity (for example, a correlation) between a predetermined template created from a typical positive component waveform of latency 700 ms to 900 ms and the event-related potential waveform of latency 700 ms to 900 ms, identifying the case judged similar as “positive component present” and the case judged not similar as “no positive component”.
- the predetermined threshold value or template may be calculated / created from a waveform of a positive component of a general user held in advance, or may be calculated / created from a waveform of a positive component for each individual.
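The two identification rules described above (section-average threshold and template similarity) might be sketched as follows; the sampling rate, epoch window, threshold, and function names are assumptions for illustration only:

```python
# Sketch of the two identification rules of the positive component detection
# unit 60. Assumes a 1 kHz sampling rate and an epoch from -100 ms to 1000 ms.
FS = 1000              # sampling rate in Hz (assumption)
EPOCH_START_MS = -100  # epoch starting latency

def section_mean(epoch, start_ms, end_ms):
    """Average potential over the given latency section."""
    i0 = (start_ms - EPOCH_START_MS) * FS // 1000
    i1 = (end_ms - EPOCH_START_MS) * FS // 1000
    return sum(epoch[i0:i1]) / (i1 - i0)

def positive_component_by_threshold(epoch, threshold_uv):
    """Rule 1: compare the 700-900 ms section average with a threshold."""
    return section_mean(epoch, 700, 900) > threshold_uv

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def positive_component_by_template(segment, template, r_min=0.7):
    """Rule 2: correlate the 700-900 ms segment with a template waveform."""
    return pearson(segment, template) >= r_min

# Flat epoch with a 3 uV bump over latency 700-900 ms:
epoch = [0.0] * 1100
for i in range(800, 1000):
    epoch[i] = 3.0
print(positive_component_by_threshold(epoch, 1.0))  # → True
```

Whether the threshold or template variant (or both) is used, and with what parameters, would be chosen per user, as the text notes.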
- the “positive component” generally means a voltage component of an event-related potential that is greater than 0 μV. In the present specification, however, the “positive component” need not be positive in the absolute sense (greater than 0 μV). The presence or absence of a “positive component” is identified in order to determine whether the confidence of discrimination is high or low, so the section average potential may be 0 μV or lower as long as the level of discrimination confidence can be distinguished. For example, in FIG. 8 there is a significant difference between about 700 ms and about 800 ms, and at that time the voltage value of the event-related potential is about 0 μV.
- a component of an event-related potential that can be used to identify the level of confidence of hearing is sometimes referred to as a “feature component”. Or more broadly, it may be referred to as a “component greater than a predetermined value” of the event-related potential.
- the speech intelligibility evaluation unit 80 receives information on the presence or absence of a positive component for each speech from the positive component detection unit 60.
- the speech intelligibility evaluation unit 80 evaluates the speech intelligibility based on the received information.
- Clarity is evaluated according to, for example, the rules shown in FIG. 10 and the presence or absence of positive components.
- FIGS. 15A, 15B, and 15C are examples in which the intelligibility for each word sound, each consonant, and each group is evaluated by addition averaging per word sound, per consonant, and per group, respectively.
- in these examples, the major classification is shown as 0/1/2 for vowels / unvoiced consonants / voiced consonants, the middle classification as 1/2 within the unvoiced and voiced consonants, and the minor classification as 1/2 for the Na/Ma lines versus the Za/Ga/Da/Ba lines. Evaluation with ○/× thus becomes possible for each word sound, each consonant, and each group.
- when the speech intelligibility of a word sound such as “na” in FIG. 15 is low, it becomes clear whether the intelligibility of “na” itself is low, whether the intelligibility of the Na line is low, or whether the intelligibility of the voiced consonants as a whole is low.
- likewise, even for a word sound such as “ya” that appears to be discriminated clearly, potential low intelligibility can be detected.
- the probability of ○ (evaluated as high speech intelligibility) may be calculated for each word sound, and the calculated probability may be used as the final speech intelligibility evaluation.
- FIG. 16 is a flowchart illustrating a procedure of processing performed in the speech intelligibility evaluation system 100.
- step S101 the presented speech control unit 70 determines the speech of a single syllable to be presented with reference to the speech DB 71, and presents the speech to the user 5 via the speech output unit 11. Then, the presented word sound control unit 70 transmits the information of the presented voice and the trigger to the positive component detection unit 60.
- the presented speech sound control unit 70 may randomly select the speech to be presented from the speech sound DB 71, or may intensively select a specific consonant or a group of speech sounds.
- step S102 the positive component detection unit 60 receives the trigger from the presented word sound control unit 70 and cuts out, from the electroencephalogram measured by the biological signal measurement unit 50, the event-related potential, for example the brain wave from −100 ms to 1000 ms starting from the trigger. It then obtains the average potential from −100 ms to 0 ms and corrects the baseline of the obtained event-related potential so that this average potential becomes 0 μV.
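The epoch extraction and baseline correction of step S102 can be sketched as follows, assuming a 1 kHz sampling rate and an EEG represented as a plain list of microvolt samples (both assumptions for illustration):

```python
# Sketch of step S102: cut out the epoch from -100 ms to 1000 ms around the
# trigger, then shift it so the mean of the -100 ms to 0 ms part becomes 0 uV.
FS = 1000  # sampling rate in Hz (assumption)

def cut_and_baseline(eeg, trigger_index):
    pre = FS // 10    # 100 ms of pre-stimulus samples
    post = FS         # 1000 ms of post-stimulus samples
    epoch = eeg[trigger_index - pre : trigger_index + post]
    baseline = sum(epoch[:pre]) / pre
    return [v - baseline for v in epoch]

# Constant 5 uV offset before the trigger at index 100, 8 uV after:
eeg = [5.0] * 100 + [8.0] * 1000
epoch = cut_and_baseline(eeg, 100)
print(epoch[0], epoch[-1])  # → 0.0 3.0
```

After this correction the pre-stimulus mean is 0 μV, so the 700 ms to 900 ms section average can be compared directly with a threshold in later steps.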
- step S103 based on the information of the presentation word sound received from the presentation word sound control unit 70, the positive component detection unit 60 adds and averages the event-related potentials cut out in step S102. For example, the averaging is performed for each word sound, for each consonant, and for each group. The process returns to step S101 until the predetermined number of additions is obtained, and the voice presentation is repeated. “Procedure for returning from step S103 to step S101” indicates repetition of trials.
- step S104 the positive component detection unit 60 identifies the waveform of the event-related potential averaged in step S103, and determines the presence or absence of a positive component having a latency of 700 ms to 900 ms.
- the positive component may be identified by comparison with a threshold value or by comparison with a template.
- step S105 the speech intelligibility evaluation unit 80 receives the information on the presence or absence of the positive component obtained in step S104 from the positive component detection unit 60, performs speech intelligibility evaluation for each word sound, consonant, and group, and accumulates the results.
- by realizing the speech intelligibility evaluation device 1 of the present embodiment with a size and weight that can be carried, speech intelligibility evaluation can be performed even in the acoustic environment in which the user actually uses a hearing aid.
- the description so far has assumed Japanese speech intelligibility evaluation, but evaluation may also be performed for a language other than Japanese, such as English or Chinese.
- a single syllable word as shown in FIG. 17 (a) may be presented and evaluation may be performed for each word, or as shown in FIG. 17 (b), for each phonetic symbol. You may evaluate.
- FIG. 17B words may be divided into groups based on the probability of occurrence of abnormal hearing and evaluated for each group.
- the speech intelligibility evaluation system 100 of the present embodiment it is not necessary to input an answer, and the speech intelligibility evaluation is realized simply by listening to the voice and thinking of the corresponding hiragana. Thereby, for example, the effort of the hearing aid user required for the evaluation in the speech intelligibility evaluation at a hearing aid dealer is significantly reduced.
- the audio output unit 11 is a speaker, but the audio output unit 11 may be a headphone.
- FIG. 18 shows the appearance of a headphone corresponding to the audio output unit 11. The use of headphones makes it easy to carry and enables the evaluation of speech intelligibility in the environment used by the user.
- an electroencephalograph corresponding to the biological signal measuring unit 50 may be incorporated together with the electrodes as in the headphones shown in FIG.
- an electrode in contact with the position Pz or Cz is disposed in the headband portion, which is designed to pass over the top of the head.
- a reference (reference) electrode and a ground electrode are disposed on the ear cushion where the speaker is disposed.
- An electroencephalograph (not shown) is provided in headphones such as an ear cushion or a headband unit. According to the present embodiment, the electroencephalogram measurement can be started by bringing the electrode Pz and the reference (reference) electrode / ground electrode into contact with the head and the periphery of the ear simultaneously with the wearing of the headphones.
- the polarity of Cz is opposite to the polarity of the electrode Pz. That is, if the confidence level of discrimination is low, it becomes negative, and if it is high, it becomes positive. Therefore, the event-related potential positive component (or component greater than or equal to the predetermined value) in the above description may be replaced with the event-related potential negative component (or component less than or equal to the predetermined value).
- Embodiment 2 In the speech intelligibility evaluation system 100 according to Embodiment 1, the speech intelligibility of speech adjusted based on the one kind of fitting method stored in advance in the speech sound DB 71 was evaluated from the presence or absence of the characteristic component of latency 700 ms to 900 ms. This characteristic component is assumed to reflect the degree of confidence of discrimination for the presented voice.
- the method based on the fitting theory is not yet well established, and several methods are mixed. Which fitting method is optimal differs for each user. Therefore, when the speech intelligibility is evaluated using a speech sound set adjusted based on a plurality of types of fitting methods instead of a speech sound set adjusted based on one type of fitting method, a result that is more suitable for each user Can be obtained.
- a speech intelligibility evaluation system that evaluates which fitting parameter is appropriate among a plurality of fitting parameters and searches for an optimal fitting method for each user will be described.
- fitting is realized by adjusting the gain for each frequency based on the shape of the audiogram and on the threshold, the UCL (uncomfortable level: a sound pressure level so loud that it is uncomfortable for the user), and the MCL (most comfortable level: a sound pressure level that the user finds comfortable), each obtained by subjective report.
- the types of fitting methods include, for example: the half gain method, in which the insertion gain at each frequency is set to half the minimum audible threshold at that frequency; the Berger method, which slightly increases the amplification from 1000 Hz to 4000 Hz in consideration of the frequency band and level of conversational speech; the POGO method, which reduces the gains at 250 Hz and 500 Hz, bands with less speech information and more noise components, by 10 dB and 5 dB, respectively; and the NAL-R method, which amplifies the long-term average speech spectrum of words to a comfortable level.
- in the present embodiment, the audio data stored in the speech sound DB 72 is converted using several fitting methods, in the same way an actual hearing aid would perform. The plural types of converted voices are then presented to the user, and each fitting method is evaluated using the characteristic component evoked, in relation to the degree of discrimination confidence, with the voice presentation as the starting point. Conversion into the plural types of sounds is realized by adjusting the sound level for each frequency; for example, when the half gain method is used, the gain at each frequency is adjusted to half the minimum audible threshold based on the user's audiogram.
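The half gain adjustment mentioned above can be sketched as follows; the audiogram values and frequency set are illustrative only:

```python
# Sketch of the half gain method described above: the insertion gain at each
# frequency is half the user's minimum audible threshold (hearing level in
# dB HL, taken from the audiogram). Values below are made-up example data.
audiogram_db_hl = {250: 30, 500: 40, 1000: 50, 2000: 60, 4000: 70}

def half_gain(audiogram):
    """Return the per-frequency insertion gain in dB under the half gain rule."""
    return {freq: hl / 2.0 for freq, hl in audiogram.items()}

gains = half_gain(audiogram_db_hl)
print(gains[1000])  # → 25.0
```

The other methods listed (Berger, POGO, NAL-R) would each replace this rule with their own per-frequency prescription before the voice sets are generated.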
- FIG. 19 shows a functional block configuration of the speech intelligibility evaluation system 200 according to the present embodiment.
- the speech intelligibility evaluation system 200 includes an audio output unit 11, a biological signal measurement unit 50, and a speech intelligibility evaluation device 2.
- the same blocks as those in FIG. 13 are denoted by the same reference numerals, and the description thereof is omitted.
- the hardware configuration of the speech intelligibility evaluation device 2 is as shown in FIG.
- the speech intelligibility evaluation apparatus 2 of the present embodiment shown in FIG. 19 is realized by executing a program that defines processing different from that of the program 35 (FIG. 12) described in Embodiment 1.
- the audio output unit 11 and the biological signal measurement unit 50 according to the present embodiment are assumed to be realized by a headphone type shown in FIG.
- the exploration electrodes are arranged at, for example, Cz, C3, and C4, and the reference electrode is positioned on either the left or right side.
- the exploration electrode may be arranged at Pz and the reference electrode may be arranged around the ear.
- the speech intelligibility evaluation apparatus 2 differs from the speech intelligibility evaluation apparatus 1 according to Embodiment 1 in that a discrimination confidence determination unit 61 is provided instead of the positive component detection unit 60, a speech sound DB 72 is provided instead of the speech sound DB 71, and the speech intelligibility evaluation unit 80 is replaced by a fitting method switching unit 90 and a fitting method evaluation unit 91.
- below, the discrimination confidence determination unit 61, the speech sound DB 72, the fitting method switching unit 90, and the fitting method evaluation unit 91 will be described.
- the discrimination confidence determination unit 61, which is the first difference, acquires an electroencephalogram from the electrode arranged at the headband position of the headphones. The discrimination confidence determination unit 61 then cuts out event-related potentials from the brain wave with the voice presentation as the starting point, adds and averages them, detects the characteristic component evoked when the discrimination confidence is low, and thereby determines the discrimination confidence.
- the waveform cutting method and the averaging method are the same as those of the positive component detection unit 60 in the speech intelligibility evaluation system 100.
- the characteristic component is detected as follows, for example.
- the discrimination confidence degree determination unit 61 compares the section average potential from the latency of 700 ms to 900 ms with a predetermined threshold value.
- the discrimination confidence determination unit 61 identifies “no feature component” when the section average potential is larger than the threshold value, and identifies “feature component present” when it is smaller.
- the “predetermined threshold value” may be calculated from the waveform of the characteristic component when the general user's discrimination confidence level is low, or may be calculated from the waveform of the characteristic component for each individual.
- based on the results shown in FIG. 9, the discrimination confidence determination unit 61 may also calculate the section average potential of latency 700 ms to 900 ms for each of the event-related potentials obtained at the electrodes C3, Cz, and C4, and detect the characteristic component based on the magnitude relation among those section average potentials.
- for example, the discrimination confidence determination unit 61 may determine “feature component present” when the section average potentials of the electrodes C3 and C4 are larger than that of the electrode Cz, and “no feature component” when they are smaller.
- erroneous detection is reduced by making a determination based on the magnitude relationship of the section average potentials of the plurality of electrodes.
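The multi-electrode rule just described can be sketched as a simple comparison; the electrode names follow the text, while the sample values and function name are illustrative:

```python
# Sketch of the multi-electrode rule: the 700-900 ms section average potential
# is computed per electrode, and "feature component present" is declared only
# when both C3 and C4 exceed Cz.
def feature_component_present(avg_potential):
    """avg_potential: dict of electrode name -> 700-900 ms section average (uV)."""
    return (avg_potential["C3"] > avg_potential["Cz"]
            and avg_potential["C4"] > avg_potential["Cz"])

print(feature_component_present({"C3": 2.1, "Cz": 0.4, "C4": 1.8}))  # → True
print(feature_component_present({"C3": 0.1, "Cz": 0.5, "C4": 1.8}))  # → False
```

Requiring agreement between the two lateral electrodes is what reduces false detections relative to a single-electrode threshold.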
- the speech sound DB 72 which is the second difference from the first embodiment, is a speech sound database for selecting an optimal fitting technique.
- FIG. 20 shows an example of the speech sound DB 72.
- the difference between the speech sound DB 72 and the speech sound DB 71 shown in FIG. 14 is that the speech sound DB 72 holds a plurality of audio sets obtained by adjusting the measurement results of the user's audiogram based on a plurality of fitting methods.
- the sound sets 72a, 72b, and 72c are adjusted based on the fitting methods A, B, and C, respectively. In each voice set, the frequency gain of the word sound is adjusted according to the fitting method.
- the items for each fitting method in the speech sound DB 72 shown in FIG. 20 are, as in the speech sound DB 71 shown in FIG. 14, the voice file to be presented, the consonant label, and the grouping data according to the likelihood of occurrence of abnormal hearing (how easily mishearing occurs).
- the type of the word sound to be stored may be the word sound listed in the 57S word table or 67S word table.
- the consonant label is used when evaluating, for the user 5, for which consonants the probability of occurrence of abnormal hearing is high.
- the grouping data is used when evaluating, for the user 5, in which group the probability of occurrence of abnormal hearing is high.
- the grouping is, for example, a major classification, a middle classification, and a minor classification, similar to the speech DB 71.
- the fitting method switching unit 90, which is the third difference from Embodiment 1, refers to the speech sound DB 72, selects a fitting method in a fixed or random order, and acquires the voice set whose frequency gains have been adjusted by the selected fitting method.
- the fitting methods include the half gain method, the Berger method, the POGO method, the NAL-R method, and the like. Note that “selecting a fitting technique” is the same as selecting a plurality of sound sets stored in the speech sound DB 72. The voice of the speech in the acquired voice set is presented to the user 5 via the voice output unit 11.
- the fitting method evaluation unit 91 receives, from the discrimination confidence determination unit 61, the amplitude of the event-related potential starting from the voice presentation, for example information on the section average potential of latency 700 ms to 900 ms, together with information on the fitting method.
- the fitting method evaluation unit 91 determines the presence or absence of the positive component for each fitting method, for example for each word sound, each consonant, and each word sound group.
- FIG. 21 shows an example in which the speech intelligibility is evaluated for each speech as the speech intelligibility evaluation results in the fitting methods A to C, for example.
- the fitting method A is a half gain method
- the fitting method B is a Berger method
- the fitting method C is a POGO method.
- the fitting technique evaluation unit 91 compares the speech intelligibility evaluation results for each fitting technique.
- FIG. 22 shows an example of the evaluation result of the fitting methods, calculated from the example of FIG. 21. In FIG. 22, based on the proportion of word sounds with high speech intelligibility, the fitting method A with a high proportion is evaluated as “○”, the fitting method most suitable for the user 5, and the fitting method B with a low proportion is evaluated as “×”, not suitable for the user 5. The fitting method C, with the second-best result, is indicated by “△”.
- the evaluation of the fitting methods is determined as “○”, “△”, or “×” according to the proportion of word sounds with high speech intelligibility, but this is only an example; any display method may be used as long as the optimal fitting method can be selected. A threshold on the proportion may also be determined in advance, and the hearing aid user may be notified that a fitting method is appropriate whenever the threshold is exceeded.
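The ranking of fitting methods by the proportion of clear word sounds might be sketched as follows; the counts, symbols assignment, and function names are hypothetical illustrations of the FIG. 22 style of display:

```python
# Sketch of the evaluation in FIG. 22: each fitting method gets the proportion
# of word sounds judged clear; the best is marked "○", the second "△", the
# rest "×". The specific marks and counts below are illustrative only.
def rank_fittings(clear_counts, total):
    """clear_counts: dict of fitting method -> number of clear word sounds."""
    order = sorted(clear_counts, key=clear_counts.get, reverse=True)
    marks = {}
    for i, method in enumerate(order):
        marks[method] = "○" if i == 0 else ("△" if i == 1 else "×")
    return {m: (clear_counts[m] / total, marks[m]) for m in clear_counts}

result = rank_fittings({"A": 18, "B": 9, "C": 14}, total=20)
print(result["A"])  # → (0.9, '○')
```

A predetermined threshold on the proportion, as mentioned above, could be layered on top of this ranking before notifying the hearing aid user.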
- FIG. 23 shows a processing procedure of the speech intelligibility system 200 according to the present embodiment.
- steps that perform the same process as the process of the speech intelligibility evaluation system 100 are denoted by the same reference numerals, and description thereof is omitted.
- the processing of the speech intelligibility evaluation system 200 according to the present embodiment differs from the processing of the speech intelligibility evaluation system 100 according to Embodiment 1 in that the step S104 of determining the presence or absence of the positive component of latency 700 ms to 900 ms starting from the voice presentation and the speech intelligibility evaluation step S105 are omitted, and steps S201 to S204 for evaluating the fitting methods are newly added.
- step S201 the fitting method switching unit 90 refers to the speech sound DB 72 and the audiogram of the user 5 measured in advance, and selects, from the plural voice sets adjusted by the plural fitting methods, the voice set with which speech intelligibility evaluation is performed.
- In step S202, the discrimination confidence determination unit 61 detects the presence or absence of the characteristic component that arises when the discrimination confidence is low, and determines the discrimination confidence based on the detection result.
- For example, when the measurement electrode is Pz, the interval average potential at a latency of 700 ms to 900 ms is compared with a predetermined threshold: if it is larger than the threshold, it is determined that the characteristic component is present, and if it is smaller, that the characteristic component is absent. Alternatively, when the measurement electrodes are C3, Cz, and C4, the interval average potential at a latency of 700 ms to 900 ms is calculated for each of C3, Cz, and C4, and the determination is based on the magnitude relation among the interval average potentials at these sites: when the interval average potentials at C3 and C4 are larger than that at Cz, it is determined that the characteristic component is present; conversely, when they are smaller, that the characteristic component is absent.
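The two detection rules above can be sketched as follows. The 700–900 ms window is taken from the text; the sampling rate, the ERP data layout (a list of samples time-locked to audio presentation), and the function names are assumptions for illustration:

```python
def interval_average(erp, fs, t0=0.7, t1=0.9):
    """Average potential of the ERP in the latency window t0..t1 (s),
    with erp sampled at fs Hz starting at the moment of presentation."""
    i0, i1 = int(t0 * fs), int(t1 * fs)
    window = erp[i0:i1]
    return sum(window) / len(window)

def feature_component_pz(erp_pz, fs, threshold):
    """Pz rule: the characteristic component is judged present when the
    700-900 ms interval average exceeds a predetermined threshold."""
    return interval_average(erp_pz, fs) > threshold

def feature_component_c3_cz_c4(erp_c3, erp_cz, erp_c4, fs):
    """C3/Cz/C4 rule: the characteristic component is judged present
    when the interval averages at C3 and C4 both exceed that at Cz."""
    avg_cz = interval_average(erp_cz, fs)
    return (interval_average(erp_c3, fs) > avg_cz and
            interval_average(erp_c4, fs) > avg_cz)
```

In practice the ERP passed in would be an average over repeated presentations of the same speech sound, which is how event-related potentials are usually extracted from noisy EEG.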
- In step S203, the fitting method evaluation unit 91 calculates, for each fitting method, the proportion of speech sounds with high discrimination confidence, based on the discrimination confidence information received from the discrimination confidence determination unit 61.
- In step S204, the fitting method evaluation unit 91 notifies the hearing aid user of the fitting method with the highest proportion of clearly heard speech sounds, calculated in step S203, as the optimum fitting method.
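Steps S203 and S204 amount to computing, per fitting method, the fraction of presented speech sounds judged to have high discrimination confidence, and then reporting the method with the highest fraction. A minimal sketch, in which the trial data layout and function names are assumed for illustration:

```python
def clear_proportion(confidence_flags):
    """Fraction of presented speech sounds judged to have high
    discrimination confidence (True = no characteristic component)."""
    return sum(confidence_flags) / len(confidence_flags)

def best_fitting_method(trials_by_method):
    """trials_by_method maps a fitting-method name to a list of
    per-trial high-confidence flags; returns the method with the
    highest proportion, together with all proportions."""
    proportions = {m: clear_proportion(flags)
                   for m, flags in trials_by_method.items()}
    return max(proportions, key=proportions.get), proportions

best, props = best_fitting_method({
    "A": [True, True, True, False],    # 0.75
    "B": [True, False, False, False],  # 0.25
    "C": [True, True, False, False],   # 0.50
})
print(best)  # A
```

Grouping the flags by consonant or by mishearing-probability group instead of by method would give the per-group evaluation mentioned in the next paragraph.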
- In this way, the proportion of clearly heard speech sounds is measured for each type of fitting method, and a fitting method suited to the user can be discovered for each speech sound, each consonant, and each speech sound group under each fitting method. Evaluation of the fitting methods is thereby realized.
- Since the speech intelligibility evaluation apparatus 2 in this embodiment is portable, speech intelligibility evaluation can be realized even in the acoustic environment in which the user actually uses the hearing aid.
- As a result, an optimal fitting method can be selected easily and automatically for each user, based on the audio actually output by the hearing aid. This eliminates the need for exploratory fitting and significantly reduces the time required for fitting.
- In the above description, the electrode position is, for example, Cz in the International 10-20 system.
- In practice, it is difficult to identify exactly the position on each user's head that corresponds to Cz. The electrode may therefore be placed at a position judged to be Cz (a position around Cz). The same applies to the electrode position Pz and the like.
- Furthermore, since the speech intelligibility evaluation system performs its evaluation automatically, it can be used in hearing aid fitting not only for users who cannot answer by speaking or by pressing buttons, such as physically handicapped users and infants, but for all people.
Abstract
Description
A computer program according to the present invention is a computer program executed by a computer, the computer program causing the computer to execute: a step of receiving a measured electroencephalogram (EEG) signal of a user; a step of determining a speech sound to be presented by referring to a speech sound database holding a plurality of monosyllabic speech sounds; a step of presenting the determined speech sound as audio; a step of determining, from the measured EEG signal of the user, the presence or absence of a characteristic component of an event-related potential at 800 ms ± 100 ms from the time at which the audio was presented; and a step of determining, based on the determination result, whether or not the user was able to hear the speech sound.
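The sequence of steps executed by the claimed program can be sketched as a single evaluation loop. Everything below (function names, the callback structure, the stub detector) is an assumption for illustration, not the patent's implementation:

```python
def evaluate_intelligibility(speech_db, present_audio, measure_erp,
                             has_feature_component):
    """For each monosyllabic speech sound in speech_db: present it as
    audio, measure the ERP time-locked to presentation, test for the
    characteristic component around 800 ms latency, and record whether
    the user is judged to have heard the sound (heard = component
    absent)."""
    results = {}
    for sound in speech_db:
        present_audio(sound)               # audio presentation step
        erp = measure_erp()                # EEG measurement step
        results[sound] = not has_feature_component(erp)
    return results
```

A caller would supply the audio output, EEG acquisition, and component-detection routines; with a detector based on the 800 ms ± 100 ms window, the returned mapping directly gives the per-sound intelligibility judgments.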
The present inventors conducted a behavioral experiment to investigate the relationship between confidence in distinguishing audio and the probability of mishearing. The experimental setup and results of this behavioral experiment are described below with reference to FIGS. 1 to 3.
The present inventors conducted an EEG measurement experiment to investigate the relationship between confidence in distinguishing audio and the event-related potential following audio presentation. The experimental setup and results of this EEG measurement experiment are described below with reference to FIGS. 5 to 9.
(1) 0 dB condition: the frequency gain was left unprocessed, producing easily distinguishable audio.
(2) −25 dB condition: the gain for frequencies from 250 Hz to 16 kHz was gradually adjusted (reduced) down to −25 dB.
(3) −50 dB condition: the gain for frequencies from 250 Hz to 16 kHz was gradually adjusted (reduced) down to −50 dB.
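The −25 dB and −50 dB conditions apply a gain that falls off gradually toward higher frequencies across 250 Hz to 16 kHz. One way to sketch such a rolloff, assuming a linear ramp on a log-frequency axis (the exact gain curve used in the experiment is not given here):

```python
import math

def rolloff_gain_db(freq_hz, max_atten_db, f_lo=250.0, f_hi=16000.0):
    """Gain in dB at freq_hz: 0 dB at f_lo, falling linearly on a
    log-frequency axis down to -max_atten_db at f_hi. The linear
    log-frequency ramp is an assumption for illustration."""
    if freq_hz <= f_lo:
        return 0.0
    if freq_hz >= f_hi:
        return -max_atten_db
    frac = math.log(freq_hz / f_lo) / math.log(f_hi / f_lo)
    return -max_atten_db * frac

# -25 dB condition: gain at the band edges
print(rolloff_gain_db(250.0, 25.0))    # 0.0
print(rolloff_gain_db(16000.0, 25.0))  # -25.0
```

Applying this curve to a signal's spectrum (e.g. via an FFT filter) would reproduce the kind of progressive high-frequency attenuation the two degraded conditions describe.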
A first embodiment of a speech intelligibility evaluation system that uses the positive component reflecting discrimination confidence is described below.
FIG. 11 shows the configuration and usage environment of the speech intelligibility evaluation system 100 according to the present embodiment. The speech intelligibility evaluation system 100 is illustrated in correspondence with the system configuration of Embodiment 1 described later.
Next, the overall processing performed in the speech intelligibility evaluation system 100 of FIG. 13 is described with reference to FIG. 16. FIG. 16 is a flowchart showing the procedure of the processing performed in the speech intelligibility evaluation system 100.
In the speech intelligibility evaluation system 100 according to Embodiment 1, speech intelligibility for audio stored in the speech sound DB 71 and adjusted in advance based on a single fitting method was evaluated by examining the presence or absence of the characteristic component at a latency of 700 ms to 900 ms. This characteristic component was taken to reflect the user's confidence in distinguishing the presented audio.
1, 2 Speech intelligibility evaluation apparatus
11 Audio output unit
50 Biological signal measurement unit
60 Positive component detection unit
61 Discrimination confidence determination unit
70 Presented speech sound control unit
71 Speech sound DB
72 Speech sound DB
80 Speech intelligibility evaluation unit
90 Fitting method switching unit
91 Fitting method evaluation unit
100, 200 Speech intelligibility evaluation system
Claims (10)
- a biological signal measurement unit that measures an electroencephalogram (EEG) signal of a user;
a presented speech sound control unit that determines a speech sound to be presented by referring to a speech sound database holding a plurality of monosyllabic speech sounds;
an audio output unit that presents, as audio, the speech sound determined by the presented speech sound control unit;
a characteristic component detection unit that determines, from the user's EEG signal measured by the biological signal measurement unit, the presence or absence of a characteristic component of an event-related potential at 800 ms ± 100 ms from the time at which the audio was presented; and
a speech intelligibility evaluation unit that determines, based on the determination result of the characteristic component detection unit, whether or not the user was able to hear the speech sound;
a speech intelligibility evaluation system comprising the above. - In a case where the event-related potential is acquired using electrode position Pz in the International 10-20 system, and the characteristic component detection unit determines that a characteristic component is present in the event-related potential when a component equal to or larger than a predetermined value is present therein,
the speech intelligibility evaluation unit determines that the user was able to hear the speech sound when the characteristic component detection unit determines that the characteristic component is absent from the event-related potential, and
the speech intelligibility evaluation unit determines that the user was unable to hear the speech sound when the characteristic component detection unit determines that the characteristic component is present in the event-related potential; the speech intelligibility evaluation system according to claim 1. - In a case where the event-related potential is acquired using electrode position Cz in the International 10-20 system, and the characteristic component detection unit determines that a characteristic component is present in the event-related potential when a component equal to or smaller than a predetermined value is present therein,
the speech intelligibility evaluation unit determines that the user was able to hear the speech sound when the characteristic component detection unit determines that the characteristic component is absent from the event-related potential, and
the speech intelligibility evaluation unit determines that the user was unable to hear the speech sound when the characteristic component detection unit determines that the characteristic component is present in the event-related potential; the speech intelligibility evaluation system according to claim 1. - The speech intelligibility evaluation system according to claim 2 or 3, wherein the speech sound database stores, for each of a plurality of speech sounds, the audio, consonant information, and a group relating to the probability of mishearing in association with one another.
- The speech intelligibility evaluation system according to claim 4, wherein the speech intelligibility evaluation unit evaluates speech intelligibility for each speech sound, for each consonant, or for each group relating to the probability of mishearing.
- The speech sound database stores a plurality of audio sets whose frequency gains have been adjusted by a plurality of fitting methods, and
the system further comprises a fitting method switching unit that selects one of the plurality of fitting methods by switching among and selecting, regularly or at random, the audio sets stored in the speech sound database; the speech intelligibility evaluation system according to claim 1. - In a case where the audio output unit presents, as audio, a speech sound in the audio set selected by the fitting method switching unit,
the speech intelligibility evaluation unit compares the determination results of whether or not the speech sound was heard among the plurality of fitting methods, and determines that a fitting method is suited to the user when its probability of the speech sound being judged as heard is high; the speech intelligibility evaluation system according to claim 6. - a presented speech sound control unit that determines a speech sound to be presented by referring to a speech sound database holding a plurality of monosyllabic speech sounds;
an audio output unit that presents, as audio, the speech sound determined by the presented speech sound control unit;
a characteristic component detection unit that determines, from the user's EEG signal measured by a biological signal measurement unit that measures the EEG signal of the user, the presence or absence of a characteristic component of an event-related potential at 800 ms ± 100 ms from the time at which the audio was presented; and
a speech intelligibility evaluation unit that determines, based on the determination result of the characteristic component detection unit, whether or not the user was able to hear the speech sound;
a speech intelligibility evaluation system comprising the above. - A step of measuring an EEG signal of a user;
a step of determining a speech sound to be presented by referring to a speech sound database holding a plurality of monosyllabic speech sounds;
a step of presenting the determined speech sound as audio; a step of determining, from the measured EEG signal of the user, the presence or absence of a characteristic component of an event-related potential at 800 ms ± 100 ms from the time at which the audio was presented; and
a step of determining, based on the determination result, whether or not the user was able to hear the speech sound;
a speech intelligibility evaluation method comprising the above steps. - A computer program executed by a computer,
the computer program causing the computer to execute:
a step of receiving a measured EEG signal of a user;
a step of determining a speech sound to be presented by referring to a speech sound database holding a plurality of monosyllabic speech sounds;
a step of presenting the determined speech sound as audio;
a step of determining, from the measured EEG signal of the user, the presence or absence of a characteristic component of an event-related potential at 800 ms ± 100 ms from the time at which the audio was presented; and
a step of determining, based on the determination result, whether or not the user was able to hear the speech sound;
a computer program for evaluating speech intelligibility that causes the computer to execute the above steps.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010545704A JP4690507B2 (ja) | 2009-07-03 | 2010-07-02 | Speech intelligibility evaluation system, method, and program therefor
CN201080003119.4A CN102202570B (zh) | 2009-07-03 | 2010-07-02 | Speech intelligibility evaluation system and method thereof
US13/037,479 US8655440B2 (en) | 2009-07-03 | 2011-03-01 | System and method of speech sound intelligibility assessment, and program thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009-159105 | 2009-07-03 | ||
JP2009159105 | 2009-07-03 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/037,479 Continuation US8655440B2 (en) | 2009-07-03 | 2011-03-01 | System and method of speech sound intelligibility assessment, and program thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011001693A1 true WO2011001693A1 (ja) | 2011-01-06 |
Family
ID=43410779
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/004358 WO2011001693A1 (ja) | 2009-07-03 | 2010-07-02 | Speech intelligibility evaluation system, method, and program therefor |
Country Status (4)
Country | Link |
---|---|
US (1) | US8655440B2 (ja) |
JP (1) | JP4690507B2 (ja) |
CN (1) | CN102202570B (ja) |
WO (1) | WO2011001693A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013030943A (ja) * | 2011-07-27 | 2013-02-07 | Kyocera Corp | Portable electronic device |
JP2018503481A (ja) * | 2014-12-08 | 2018-02-08 | マイブレイン テクノロジーズ | Headset for acquiring biological signals |
US10835179B2 (en) | 2014-12-08 | 2020-11-17 | Mybrain Technologies | Headset for bio-signals acquisition |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012108128A1 (ja) | 2011-02-10 | 2012-08-16 | パナソニック株式会社 | EEG recording apparatus, hearing aid, EEG recording method, and program therefor |
CN103135751A (zh) * | 2011-11-30 | 2013-06-05 | 北京德信互动网络技术有限公司 | Voice-control-based intelligent electronic device and voice control method |
CN103366759A (zh) * | 2012-03-29 | 2013-10-23 | 北京中传天籁数字技术有限公司 | Method and apparatus for evaluating speech data |
CN103054586B (zh) * | 2012-12-17 | 2014-07-23 | 清华大学 | Automatic Chinese speech audiometry method based on a dynamic Chinese speech audiometry word list |
CN103892828A (zh) * | 2012-12-26 | 2014-07-02 | 光宝电子(广州)有限公司 | Brainwave sensing device |
WO2014130571A1 (en) * | 2013-02-19 | 2014-08-28 | The Regents Of The University Of California | Methods of decoding speech from the brain and systems for practicing the same |
JP2014176582A (ja) * | 2013-03-15 | 2014-09-25 | Nitto Denko Corp | Hearing test apparatus, hearing test method, and method for creating words for hearing tests |
US10037712B2 (en) * | 2015-01-30 | 2018-07-31 | Toyota Motor Engineering & Manufacturing North America, Inc. | Vision-assist devices and methods of detecting a classification of an object |
US10372755B2 (en) | 2015-09-23 | 2019-08-06 | Motorola Solutions, Inc. | Apparatus, system, and method for responding to a user-initiated query with a context-based response |
US11868354B2 (en) * | 2015-09-23 | 2024-01-09 | Motorola Solutions, Inc. | Apparatus, system, and method for responding to a user-initiated query with a context-based response |
EP3203472A1 (en) * | 2016-02-08 | 2017-08-09 | Oticon A/s | A monaural speech intelligibility predictor unit |
AU2016423667B2 (en) | 2016-09-21 | 2020-03-12 | Motorola Solutions, Inc. | Method and system for optimizing voice recognition and information searching based on talkgroup activities |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0739540A (ja) * | 1993-07-30 | 1995-02-10 | Sony Corp | Speech analysis apparatus |
JP2002346213A (ja) * | 2001-05-30 | 2002-12-03 | Yamaha Corp | Game apparatus and game program having a hearing measurement function |
JP2003319497A (ja) * | 2002-04-26 | 2003-11-07 | Matsushita Electric Ind Co Ltd | Examination center apparatus, terminal apparatus, hearing compensation method, recording medium storing a hearing compensation program, and hearing compensation program |
WO2006003901A1 (ja) * | 2004-07-02 | 2006-01-12 | Matsushita Electric Industrial Co., Ltd. | Device using biological signals and control method therefor |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH06114038A | 1992-10-05 | 1994-04-26 | Mitsui Petrochem Ind Ltd | Hearing test and training apparatus
JPH0938069A | 1995-08-02 | 1997-02-10 | Nippon Telegr & Teleph Corp <Ntt> | Speech audiometry method and apparatus for implementing the method
US7399282B2 (en) * | 2000-05-19 | 2008-07-15 | Baycrest Center For Geriatric Care | System and method for objective evaluation of hearing using auditory steady-state responses |
US6602202B2 (en) * | 2000-05-19 | 2003-08-05 | Baycrest Centre For Geriatric Care | System and methods for objective evaluation of hearing using auditory steady-state responses |
US8165687B2 (en) * | 2008-02-26 | 2012-04-24 | Universidad Autonoma Metropolitana, Unidad Iztapalapa | Systems and methods for detecting and using an electrical cochlear response (“ECR”) in analyzing operation of a cochlear stimulation system |
-
2010
- 2010-07-02 CN CN201080003119.4A patent/CN102202570B/zh active Active
- 2010-07-02 WO PCT/JP2010/004358 patent/WO2011001693A1/ja active Application Filing
- 2010-07-02 JP JP2010545704A patent/JP4690507B2/ja active Active
-
2011
- 2011-03-01 US US13/037,479 patent/US8655440B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
US20110152708A1 (en) | 2011-06-23 |
CN102202570A (zh) | 2011-09-28 |
JPWO2011001693A1 (ja) | 2012-12-13 |
JP4690507B2 (ja) | 2011-06-01 |
US8655440B2 (en) | 2014-02-18 |
CN102202570B (zh) | 2014-04-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4690507B2 (ja) | Speech intelligibility evaluation system, method, and program therefor | |
JP4638558B2 (ja) | Speech intelligibility evaluation system, method, and computer program therefor | |
JP4769336B2 (ja) | Hearing aid adjustment apparatus, method, and program | |
JP5144835B2 (ja) | Annoyance determination system, apparatus, method, and program | |
JP5002739B2 (ja) | Hearing determination system, method, and program therefor | |
JP5144836B2 (ja) | Speech listening evaluation system, method, and program therefor | |
JP4838401B2 (ja) | Speech intelligibility evaluation system, method, and program therefor | |
JP5042398B1 (ja) | EEG recording apparatus, hearing aid, EEG recording method, and program therefor | |
WO2013001836A1 (ja) | Discomfort threshold estimation system, method and program therefor, hearing aid adjustment system, and discomfort threshold estimation processing circuit | |
JPWO2013161235A1 (ja) | Speech sound discrimination ability determination apparatus, speech sound discrimination ability determination system, hearing aid gain determination apparatus, speech sound discrimination ability determination method, and program therefor | |
Niemczak et al. | Informational masking effects on neural encoding of stimulus onset and acoustic change | |
Vestergaard et al. | Auditory size-deviant detection in adults and newborn infants | |
KR102310542B1 (ko) | Hearing test apparatus, method, and program using monosyllabic words | |
Kurkowski et al. | Phonetic Audiometry and its Application in the Diagnosis of People with Speech Disorders | |
Benefit | Evidence Supports the Advantages of Signia AX's Split Processing | |
JP2006230633A (ja) | Method for examining brain dominance | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201080003119.4 Country of ref document: CN |
|
ENP | Entry into the national phase |
Ref document number: 2010545704 Country of ref document: JP Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 10793865 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 10793865 Country of ref document: EP Kind code of ref document: A1 |