
Key Technologies of Detecting Depression with Voice Features

Source: 泰然健康網 | Time: 2024-11-24 02:34

Title: 語音識別抑郁癥的關鍵技術研究
Alternative title: Key Technologies of Detecting Depression with Voice Features
Author: 潘瑋 (Pan Wei)
Date: 2020-05

Abstract

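Depression is a mental illness centered on depressive symptoms and accompanied by many other symptoms. Diagnosis is currently largely subjective, so objective assessment tools are especially important for faster and more accurate treatment of depression. Speech data are easy to obtain in clinical settings, but several questions about the relationship between speech and depression remain open: whether speech features significantly predict depression and, once the confounding variables (demographic information) are included, how much speech contributes to the prediction; whether speech features can distinguish depressed from non-depressed people; whether the association is stable across contexts and emotions; and whether speech features retain high discriminative power in complex clinical diagnostic settings.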

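Study 1 used a binary logistic regression model to examine whether the association between speech features and depression is significant. Demographic information was also included, and its contribution to predicting depression served as the baseline. The study collected speech data from 584 patients with depression and 548 healthy people. Four speech features made the main contributions to predicting depression: PC1 (OR = 0.58, P < 0.0001), PC6 (OR = 1.57, P < 0.001), PC17 (OR = 1.53, P < 0.0001), and PC24 (OR = 1.45, P < 0.05). The contribution of speech features alone to depression reached 35.65% (Nagelkerke's R²).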
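As a rough illustration of the two statistics reported for Study 1, the sketch below shows how an odds ratio is obtained from a logistic-regression coefficient and how Nagelkerke's R² is computed from log-likelihoods. The function names and the fitted log-likelihood are assumptions made for this example; only the sample sizes (584 and 548) and the odds ratio 0.58 come from the abstract, and this is not the thesis's actual analysis code.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a logistic-regression coefficient (log-odds) into an odds ratio."""
    return math.exp(beta)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's R^2 from the null and fitted log-likelihoods of n samples."""
    # Cox-Snell R^2, then rescaled by its maximum attainable value.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Sample sizes reported for Study 1.
n_dep, n_hc = 584, 548
n = n_dep + n_hc

# The intercept-only (null) log-likelihood depends only on the class proportions.
ll_null = n_dep * math.log(n_dep / n) + n_hc * math.log(n_hc / n)

# Hypothetical fitted log-likelihood, chosen only to exercise the formula.
ll_model = 0.75 * ll_null

print("Nagelkerke R^2 (hypothetical fit):", round(nagelkerke_r2(ll_null, ll_model, n), 3))
print("OR for beta = ln(0.58):", round(odds_ratio(math.log(0.58)), 2))
```

An odds ratio below 1 (as for PC1) means higher values of that component are associated with lower odds of depression, while values above 1 (PC6, PC17, PC24) indicate higher odds.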

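Study 2 built three classification models: a model based on speech alone, a model based on demographic variables alone, and a model based on both speech and demographic variables. The study also brought in other datasets as test sets to show how well the models generalize. It comprised three speech datasets. Dataset 1 is the same as in Study 1 and was used to build the classification models. Dataset 2 contains 500 patients with depression and 404 healthy people. Dataset 3 contains 45 patients with depression and 58 healthy people. Compared with the depression classification model built on demographic variables, the models that included speech (the speech-only model and the speech-plus-demographics model) consistently reached higher classification accuracy (F-measure). Testing on the other datasets gave consistent results. In this study, the speech-only model reached a classification accuracy of 80% on each of the test sets.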
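For readers unfamiliar with the F-measure used to compare the models, here is a minimal sketch of the balanced F1 variant computed from confusion-matrix counts. The abstract does not say which F-measure variant was used, so F1 is an assumption, and the counts in the usage line are invented placeholders rather than results from the thesis.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall.

    tp: depressed speakers correctly flagged
    fp: healthy speakers wrongly flagged
    fn: depressed speakers missed
    """
    denom = 2 * tp + fp + fn
    # With no positive predictions and no positive cases, define F1 as 0.
    return 2 * tp / denom if denom else 0.0


# Placeholder counts for illustration only (not taken from the thesis).
print(f_measure(tp=8, fp=2, fn=2))
```

Because F1 ignores true negatives, it stays informative when the two groups differ in size, as they do in Datasets 2 and 3.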

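Study 3 collected speech data from 45 patients with depression and 58 healthy people. It used a 3 (emotional state: positive, neutral, negative) × 3 (task type: question and answer, text reading, picture description) experimental design and applied a machine-learning classification algorithm, logistic regression (LR), to build the depression recognition model. The results showed that the AUC of speech was above 0.6 in every context and emotional state (65.7% to 80.9%), and that the accuracy of speech-based depression recognition can reach 82.9%.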
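The 3 × 3 design yields nine conditions, each evaluated by AUC. The sketch below enumerates those conditions and computes AUC with the pairwise (Mann-Whitney) definition: the probability that a randomly chosen patient receives a higher model score than a randomly chosen healthy speaker. The helper name `auc` and the example scores are hypothetical; the thesis may have used a different implementation.

```python
from itertools import product

EMOTIONS = ("positive", "neutral", "negative")
TASKS = ("question and answer", "text reading", "picture description")

# Nine experimental conditions: every emotion crossed with every task.
CONDITIONS = list(product(EMOTIONS, TASKS))


def auc(pos_scores, neg_scores):
    """AUC as the share of (patient, control) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores for one condition (illustration only).
patient_scores = [0.9, 0.3]
control_scores = [0.5, 0.1]
print(CONDITIONS[0], auc(patient_scores, control_scores))
```

An AUC of 0.5 corresponds to chance-level ranking, which is why values above 0.6 in all nine conditions are read as evidence of a stable association.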

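Study 4 defined three kinds of classification task: 1) classifying healthy versus non-healthy groups; 2) classifying the healthy group against each mental illness; and 3) pairwise classification among the mental illnesses. After matching, there were 32 patients with bipolar disorder, 106 patients with depression, 114 healthy participants, and 20 patients with schizophrenia. MFCC features were extracted from the speech and i-vectors were derived from them. Evaluation of the logistic regression models showed that the model classifying depression versus bipolar disorder had an AUC of 0.5 (F-score = 0.44). For the other classification tasks, AUC ranged from 0.75 to 0.92 (F-score: 0.73 to 0.91). In the comparison of model performance, difference tests found that the depression versus bipolar disorder model performed significantly worse (AUC) than the bipolar disorder versus schizophrenia model (corrected P < 0.05). Differences among the models for the other classification tasks were not significant. Speech features therefore did not classify depression versus bipolar disorder well.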
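To make the three task types concrete, the following sketch enumerates the classification tasks implied by the four groups. The group sizes come from the abstract; the names `GROUP_SIZES` and `build_tasks` and the way tasks are represented as label pairs are assumptions made for this illustration.

```python
from itertools import combinations

# Matched group sizes reported for Study 4.
GROUP_SIZES = {
    "healthy": 114,
    "depression": 106,
    "bipolar disorder": 32,
    "schizophrenia": 20,
}
ILLNESSES = [g for g in GROUP_SIZES if g != "healthy"]


def build_tasks():
    """Return every binary task as a (class_a, class_b) label pair."""
    tasks = [("healthy", "non-healthy")]                  # type 1: one task
    tasks += [("healthy", illness) for illness in ILLNESSES]  # type 2: one per illness
    tasks += list(combinations(ILLNESSES, 2))             # type 3: illness pairs
    return tasks


for task in build_tasks():
    print(task)
```

Under this enumeration there are seven binary tasks in total, and the depression versus bipolar disorder pair is the one the abstract reports at chance level (AUC 0.5).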

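This research systematically examined the relationship between speech features and depression and showed the following: (1) speech features significantly predict depression, and speech makes a considerable contribution to the prediction; (2) speech features can predict depression in practice, and the models generalize to some degree; (3) the predictive role of speech is stable across contexts and emotions; (4) speech retains fairly high discriminative power in the complex setting of clinical diagnosis of mental illness. These key-technology studies lay a solid foundation for further exploring speech as a tool for the clinical diagnosis of depression.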

Alternative abstract

Depression is characterized by depressed mood and other complex symptoms, which makes it particularly important to find a more objective assessment tool that promotes faster and more accurate treatment. Speech data are easily accessible in clinical settings, but research on speech and depression still leaves several questions open: whether speech features can significantly predict depression, and how much speech contributes to the prediction compared with confounding factors (demographic information); whether speech features can successfully classify depression in practice; whether speech features can predict depression across different contexts and emotions; and whether speech features can maintain good discriminative power in complex clinical diagnostic situations.

Study 1 included demographic information, whose contribution to predicting depression was taken as the baseline. The study collected speech data from 584 patients with depression and 548 healthy people. Results showed that multiple speech features predicted depression: PC1 (OR = 0.58, P < 0.0001), PC6 (OR = 1.57, P < 0.001), PC17 (OR = 1.53, P < 0.0001), and PC24 (OR = 1.45, P < 0.05). Speech features alone accounted for 35.65% (Nagelkerke's R²).

Study 2 established three classification models: a speech-only model, a demographics-only model, and a model based on both speech and demographic variables. The study contains three datasets. Dataset 1 is the same as in Study 1 and was used for classification model construction and testing. Dataset 2 contains 500 patients with depression and 404 healthy people. Dataset 3 contains 45 patients with depression and 58 healthy people. Compared with the model based on demographic variables, the models that included speech all reached higher prediction accuracy (F-measure). The results were consistent when the models were tested on the other datasets. The speech-only model reached 80% on each of the test sets.

Study 3 collected speech data from 45 patients with depression and 58 healthy people. The research adopted a 3 (emotional state: positive, neutral, negative) × 3 (task type: question and answer, text reading, picture description) experimental design, with logistic regression as the classification algorithm. Results showed that the AUC of speech was above 0.6 across the different situations and emotional states (65.7% to 80.9%), and the accuracy of speech-based depression recognition reached 82.9%.

Study 4 set up three types of classification task: 1) classifying healthy and non-healthy groups; 2) classifying the healthy group against each mental illness; 3) pairwise classification among mental illnesses. After matching, there were 32 patients with bipolar disorder, 106 patients with depression, 114 healthy participants, and 20 patients with schizophrenia. MFCC features were extracted from the speech, and i-vectors were then derived. After logistic regression modeling and evaluation of model performance, the AUC of the model classifying depression versus bipolar disorder was 0.5 (F-score = 0.44). For the other classification tasks, AUC was between 0.75 and 0.92 (F-score range: 0.73 to 0.91). In the model comparison, difference tests found that the depression versus bipolar disorder model performed significantly worse (AUC) than the bipolar disorder versus schizophrenia model (adjusted P < 0.05). The differences among the models for the other classification tasks were not significant. Speech features are not well suited to distinguishing depression from bipolar disorder.

This research systematically explores the relationship between speech features and depression and shows the following: (1) speech features can significantly predict depression, and speech contributes significantly to the prediction; (2) speech features predict depression in practice, and the model has high generalization ability; (3) the predictive role of speech is stable across contexts and emotions; (4) speech can have high discriminative ability in the complex context of clinical diagnosis of mental illness. These studies provide a basis for further exploring speech as a diagnostic tool for clinical depression.

Keywords: depression; speech features; assisted diagnosis; machine learning; classification and recognition
Degree type: Doctoral
Language: Chinese
Degree name: Doctor of Science
Degree discipline: Applied Psychology
Degree-granting institution: Institute of Psychology, Chinese Academy of Sciences
Place of conferral: Institute of Psychology, Chinese Academy of Sciences
Document type: Thesis
Identifier: http://ir.psych.ac.cn/handle/311026/31761
Collection: Laboratory of Social and Engineering Psychology

Recommended citation (GB/T 7714): 潘瑋. 語音識別抑郁癥的關鍵技術研究[D]. 中國科學院心理研究所. 中國科學院心理研究所, 2020.

Files in this item: 潘瑋-博士學位論文.pdf (2623KB); document type: thesis; access: restricted; license: CC BY-NC-SA

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.

URL: Key Technologies of Detecting Depression with Voice Features http://www.u1s5d6.cn/newsview44859.html
