Research on Key Technologies for Recognizing Depression from Speech

Source: 泰然健康網 | Date: 2024-11-24 02:34

Other title: Key Technologies of Detecting Depression with Voice Features
Author: Pan Wei (潘瑋)
Date: 2020-05

Abstract

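Depression is a mental disorder centered on depressive symptoms and accompanied by many other symptoms. Diagnosis is currently largely subjective, so objective assessment tools are especially important for faster and more accurate treatment. Speech data are easy to collect in clinical settings, but several questions about the relationship between speech and depression remain open: whether speech features significantly predict depression, and how much speech contributes to the prediction once confounding variables (demographic information) are included; whether speech features can distinguish depressed from non-depressed people; whether the association is stable across contexts and emotions; and whether speech features retain high discriminative power in complex clinical diagnostic settings.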

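Study 1 used a binary logistic regression model to test whether the association between speech features and depression is significant. Demographic information was included, and its contribution to predicting depression served as the baseline. Speech data were collected from 584 patients with depression and 548 healthy people. Four speech features made the main contributions to predicting depression: PC1 (OR = 0.58, P < 0.0001), PC6 (OR = 1.57, P < 0.001), PC17 (OR = 1.53, P < 0.0001), and PC24 (OR = 1.45, P < 0.05). The separate contribution of speech features to depression reached 35.65% (Nagelkerke's R²).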
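As an illustration of the two statistics quoted for Study 1, the sketch below converts an odds ratio into a logistic-regression coefficient and computes Nagelkerke's R² from log-likelihoods. The function names are hypothetical. Only the group sizes (584 and 548) and the two odds ratios come from the abstract; the fitted model's log-likelihood is not reported there, so none is assumed.

```python
import math


def null_log_likelihood(n_pos: int, n_neg: int) -> float:
    """Log-likelihood of an intercept-only logistic model (predicts the base rate)."""
    n = n_pos + n_neg
    return n_pos * math.log(n_pos / n) + n_neg * math.log(n_neg / n)


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke's R^2: Cox-Snell R^2 rescaled so that a perfect model scores 1."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


def coefficient_from_odds_ratio(odds_ratio: float) -> float:
    """A logistic coefficient is the natural log of its odds ratio."""
    return math.log(odds_ratio)


# Group sizes reported for Study 1.
ll0 = null_log_likelihood(584, 548)
n_total = 584 + 548

# An OR below 1 (PC1) gives a negative coefficient; an OR above 1 (PC6) a positive one.
beta_pc1 = coefficient_from_odds_ratio(0.58)
beta_pc6 = coefficient_from_odds_ratio(1.57)
```

Read this way, a one-unit increase in PC1 multiplies the odds of depression by 0.58, while a one-unit increase in PC6 multiplies them by 1.57. Nagelkerke's R² is 0 for a model no better than the base rate and 1 for a model with a log-likelihood of 0.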

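Study 2 built three classification models: a model based on speech alone, a model based on demographic variables alone, and a model based on both speech and demographic variables. Additional datasets were used as test sets to assess how well the models generalize. The study included three speech datasets. Dataset 1, the same as in Study 1, was used to build the classification models. Dataset 2 contained 500 patients with depression and 404 healthy people. Dataset 3 contained 45 patients with depression and 58 healthy people. Compared with the classification model built on demographic variables, the models that included speech (speech alone, and speech plus demographic variables) consistently achieved higher classification accuracy (F-measure). Results on the other datasets were consistent with this. The speech-only model reached a classification accuracy of 80% on each of the test sets.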
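For reference, the F-measure used to compare the models can be computed from confusion-matrix counts as in the minimal sketch below. The counts are hypothetical, chosen only to match the size of Dataset 3 (45 patients, 58 healthy people); they are not results from the thesis, and the abstract does not say which F-measure variant was used, so the common F1 default is assumed.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical test-set counts (not from the thesis): 45 patients, 58 controls.
tp, fn = 36, 9   # patients flagged / missed
fp, tn = 9, 49   # controls flagged / cleared

f1 = f_measure(tp, fp, fn)
acc = accuracy(tp, tn, fp, fn)
```

Unlike plain accuracy, the F-measure ignores true negatives, so it rewards a model for finding patients and penalizes it for false alarms.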

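Study 3 collected speech data from 45 patients with depression and 58 healthy people. It used a 3 (emotional state: positive, neutral, negative) × 3 (task type: question answering, text reading, picture description) experimental design and built the depression recognition model with a machine learning classifier, Logistic Regression (LR). Across the different tasks and emotional states, the AUC for speech was always above 0.6 (65.7% to 80.9%), and the accuracy of speech-based depression recognition reached 82.9%.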
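The sketch below enumerates the nine conditions of the 3 × 3 design and shows one standard way to compute AUC, as the share of patient and control pairs that the model ranks correctly. The scores are made up for illustration, and the function name is hypothetical; the abstract does not describe how the thesis computed AUC.

```python
from itertools import product


def auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (Mann-Whitney U divided by the number of pairs).
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# The 3 x 3 design from Study 3: emotional state crossed with task type.
EMOTIONS = ("positive", "neutral", "negative")
TASKS = ("question answering", "text reading", "picture description")
conditions = list(product(EMOTIONS, TASKS))

# Hypothetical model scores for one condition (made up for illustration).
patients = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
example_auc = auc(patients, controls)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the "above 0.6" threshold should be read.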

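Study 4 defined three kinds of classification task: 1) healthy versus non-healthy groups; 2) the healthy group versus each mental disorder; and 3) pairwise classification between mental disorders. After matching, the sample contained 32 patients with bipolar disorder, 106 patients with depression, 114 healthy participants, and 20 patients with schizophrenia. MFCC features were extracted from the speech and i-vectors were derived from them. Evaluation of the logistic regression models showed that the model separating depression from bipolar disorder had an AUC of 0.5 (F-score = 0.44). For the other classification tasks, AUC ranged from 0.75 to 0.92 (F-score: 0.73 to 0.91). In the comparison of model performance, difference tests showed that the depression versus bipolar disorder model performed significantly worse (AUC) than the bipolar disorder versus schizophrenia model (corrected P < 0.05). Differences among the other task models were not significant. Speech features therefore did not separate depression from bipolar disorder well.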
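To make the task structure of Study 4 concrete, the following sketch lists every binary task implied by the three task types and the sample count on each side. The group sizes are those reported after matching. The task labels, the data layout, and the function names are assumptions made for illustration, not the thesis's own code.

```python
from itertools import combinations

# Group sizes after matching, as reported for Study 4.
GROUPS = {"healthy": 114, "depression": 106, "bipolar": 32, "schizophrenia": 20}
DISORDERS = [g for g in GROUPS if g != "healthy"]


def build_tasks(disorders):
    """Return (task type, positive groups, negative groups) for every binary task."""
    tasks = [("healthy vs non-healthy", tuple(disorders), ("healthy",))]
    for d in disorders:
        tasks.append(("healthy vs one disorder", (d,), ("healthy",)))
    for a, b in combinations(disorders, 2):
        tasks.append(("disorder vs disorder", (a,), (b,)))
    return tasks


def class_sizes(task, groups):
    """Sample counts on the positive and negative side of a binary task."""
    _, pos, neg = task
    return sum(groups[g] for g in pos), sum(groups[g] for g in neg)


tasks = build_tasks(DISORDERS)
for task in tasks:
    print(task[0], task[1], "vs", task[2], class_sizes(task, GROUPS))
```

With three disorders this gives seven binary tasks in total. The depression versus bipolar disorder task is the one for which the abstract reports an AUC of 0.5, which is chance level.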

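This thesis systematically examined the relationship between speech features and depression and showed the following: (1) speech features significantly predict depression and make a considerable contribution to the prediction; (2) speech features can predict depression in practice, and the models generalize to some extent; (3) the predictive effect of speech is stable across contexts and emotions; and (4) speech has fairly high discriminative power in the complex setting of clinical diagnosis of mental disorders. These studies of key technologies lay a solid foundation for further exploring speech as a clinical diagnostic tool for depression.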

Other abstract

Depression is characterized by depressed mood together with other complex symptoms, which makes an objective assessment tool particularly important for faster and more accurate treatment. Speech data are easy to obtain clinically, but research on speech and depression still leaves several questions open: whether speech features significantly predict depression, and how much speech contributes to the prediction compared with confounding factors (demographic information); whether speech features can classify depression successfully in practice; whether speech features predict depression across different contexts and emotions; and whether speech features maintain good discriminative power in complex clinical diagnostic situations.

In Study 1, demographic information was included, and its contribution to predicting depression was taken as the baseline. Speech data were collected from 584 patients with depression and 548 healthy people. Several speech characteristics were associated with depression: PC1 (OR = 0.58, P < 0.0001), PC6 (OR = 1.57, P < 0.001), PC17 (OR = 1.53, P < 0.0001), and PC24 (OR = 1.45, P < 0.05). Speech features alone contributed 35.65% (Nagelkerke's R²).

Study 2 established three classification models: speech-only models, demographics-only models, and models based on both speech and demographic variables. The study used three datasets. Dataset 1 is the same as in Study 1 and was used for model construction and testing. Dataset 2 contains 500 patients with depression and 404 healthy people. Dataset 3 contains 45 patients with depression and 58 healthy people. Compared with the demographics-only models, the models that included speech all reached higher predictive accuracy (F-measure). The results were consistent when the models were tested on the other datasets. The speech-only model reached 80% on each of the test sets.

Study 3 collected speech from 45 patients with depression and 58 healthy people. It adopted a 3 (emotional state: positive, neutral, negative) × 3 (task type: question and answer, text reading, picture description) experimental design, with Logistic Regression as the classification algorithm. The AUC for speech across the different situations and emotional states was always above 0.6 (65.7% to 80.9%), and the accuracy of speech-based depression recognition reached 82.9%.

Study 4 set up three types of classification task: 1) classifying healthy and non-healthy groups; 2) classifying the healthy group against each mental illness; and 3) pairwise classification among mental illnesses. After matching, there were 32 patients with bipolar disorder, 106 patients with depression, 114 healthy participants, and 20 patients with schizophrenia. MFCC features were extracted from speech, and i-vectors were then derived. After logistic regression modeling and evaluation of model performance, the AUC of the model classifying depression against bipolar disorder was 0.5 (F-score = 0.44). For the other classification tasks, AUC was between 0.75 and 0.92 (F-score range: 0.73 to 0.91). In the model comparison, a difference test found that the depression versus bipolar disorder model performed significantly worse (AUC) than the bipolar disorder versus schizophrenia model (adjusted P < 0.05). The differences among the other task models were not significant. Speech features are therefore not well suited to distinguishing depression from bipolar disorder.

This study systematically explores the relationship between speech features and depression and shows the following: (1) speech features significantly predict depression and contribute substantially to the prediction; (2) speech features predict depression in practice, and the model has high generalization ability; (3) the predictive role of speech is stable across contexts and emotions; and (4) speech can have fairly high discriminative ability in the complex context of clinical diagnosis of mental illness. These studies provide a basis for further exploring speech as a diagnostic tool for clinical depression.

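Keywords: depression; speech features; auxiliary diagnosis; machine learning; classification and recognition
Degree type: Doctoral
Language: Chinese
Degree name: Doctor of Science
Degree discipline: Applied Psychology
Degree-granting institution: Institute of Psychology, Chinese Academy of Sciences
Place of conferral: Institute of Psychology, Chinese Academy of Sciences
Document type: Thesis
Item identifier: http://ir.psych.ac.cn/handle/311026/31761
Collection: Laboratory of Social and Engineering Psychology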
Recommended citation
GB/T 7714: Pan Wei (潘瑋). Research on Key Technologies for Recognizing Depression from Speech [D]. Institute of Psychology, Chinese Academy of Sciences, 2020.

Files in this item
File name/size: 潘瑋-博士學位論文.pdf (2623 KB) | Document type: thesis | Access: restricted | License: CC BY-NC-SA

URL: Research on Key Technologies for Recognizing Depression from Speech http://www.u1s5d6.cn/newsview44859.html