000063148 001__ 63148
000063148 005__ 20171109140606.0
000063148 0247_ $$2doi$$a10.1186/1687-4722-2013-23
000063148 0248_ $$2sideral$$a84313
000063148 037__ $$aART-2013-84313
000063148 041__ $$aeng
000063148 100__ $$aTejedor, J.
000063148 245__ $$aQuery-by-example spoken term detection ALBAYZIN 2012 evaluation: Overview, systems, results, and discussion
000063148 260__ $$c2013
000063148 5060_ $$aAccess copy available to the general public$$fUnrestricted
000063148 5203_ $$aQuery-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. It has recently received much interest due to the high volume of information stored in audio or audiovisual format. QbE STD differs from automatic speech recognition (ASR) and keyword spotting (KWS)/spoken term detection (STD), since ASR is interested in all the terms/words that appear in the speech signal and KWS/STD relies on a textual transcription of the search term to retrieve the speech data. This paper presents the systems submitted to the ALBAYZIN 2012 QbE STD evaluation, held as part of the ALBAYZIN 2012 evaluation campaign within the context of the IberSPEECH 2012 Conference. The evaluation consists of retrieving the speech files that contain the input queries, indicating their start and end timestamps within the appropriate speech file. Evaluation is conducted on a Spanish spontaneous speech database containing a set of talks from MAVIR workshops, which amount to about 7 h of speech in total. We present the database, the metric, and the systems submitted, along with all results and some discussion. Four different research groups took part in the evaluation. Evaluation results show the difficulty of this task, and the limited performance indicates there is still a lot of room for improvement. The best result is achieved by a dynamic time warping-based search over Gaussian posteriorgrams/posterior phoneme probabilities. This paper also compares the systems, aiming to establish the best technique for this difficult task and to define promising directions for this relatively novel task.
000063148 540__ $$9info:eu-repo/semantics/openAccess$$aby$$uhttp://creativecommons.org/licenses/by/3.0/es/
000063148 590__ $$a0.382$$b2013
000063148 591__ $$aENGINEERING, ELECTRICAL & ELECTRONIC$$b214 / 246 = 0.87$$c2013$$dQ4$$eT3
000063148 591__ $$aACOUSTICS$$b27 / 31 = 0.871$$c2013$$dQ4$$eT3
000063148 655_4 $$ainfo:eu-repo/semantics/article$$vinfo:eu-repo/semantics/publishedVersion
000063148 700__ $$aToledano, D.T.
000063148 700__ $$aAnguera, X.
000063148 700__ $$aVarona, A.
000063148 700__ $$aHurtado, L.F.
000063148 700__ $$0(orcid)0000-0001-5803-4316$$aMiguel, A.$$uUniversidad de Zaragoza
000063148 700__ $$aColás, J.
000063148 7102_ $$15008$$2800$$aUniversidad de Zaragoza$$bDepartamento de Ingeniería Electrónica y Comunicaciones$$cTeoría de la Señal y Comunicaciones
000063148 773__ $$g2013, 23 (2013), [17 pp]$$pEURASIP j. audio, speech music. process.$$tEURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING$$x1687-4714
000063148 8564_ $$s998152$$uhttp://zaguan.unizar.es/record/63148/files/texto_completo.pdf$$yVersión publicada
000063148 8564_ $$s108693$$uhttp://zaguan.unizar.es/record/63148/files/texto_completo.jpg?subformat=icon$$xicon$$yVersión publicada
000063148 909CO $$ooai:zaguan.unizar.es:63148$$particulos$$pdriver
000063148 951__ $$a2017-11-09-11:57:47
000063148 980__ $$aARTICLE