About: Audio-visual speech recognition     Goto   Sponge   NotDistinct   Permalink

An Entity of Type : owl:Thing, within Data Space : el.dbpedia.org associated with source document(s)

Audio visual speech recognition (AVSR) is a technique that uses image processing capabilities in lip reading to aid speech recognition systems in recognizing undeterministic phones or giving preponderance among near probability decisions.

AttributesValues
rdfs:label
  • Audio-visual speech recognition (en)
rdfs:comment
  • Audio visual speech recognition (AVSR) is a technique that uses image processing capabilities in lip reading to aid speech recognition systems in recognizing undeterministic phones or giving preponderance among near probability decisions. (en)
sameAs
dbp:wikiPageUsesTemplate
Subject
Link from a Wikipage to an external page
prov:wasDerivedFrom
Wikipage page ID
page length (characters) of wiki page
Wikipage revision ID
Link from a Wikipage to another Wikipage
has abstract
  • Audio visual speech recognition (AVSR) is a technique that uses image processing capabilities in lip reading to aid speech recognition systems in recognizing undeterministic phones or giving preponderance among near probability decisions. Each system of lip reading and speech recognition works separately, then their results are mixed at the stage of . As the name suggests, it has two parts. First one is the audio part and second one is the visual part. In audio part we use features like log mel spectogram, mfcc etc. from the raw audio samples and we build a model to get feature vector out of it . For visual part generally we use some variant of convolutional neural network to compress the image to a feature vector after that we concatenate these two vectors (audio and visual ) and try to predict the target object. (en)
foaf:isPrimaryTopicOf
is Wikipage redirect of
is Link from a Wikipage to another Wikipage of
is foaf:primaryTopic of
Faceted Search & Find service v1.17_git151 as of Feb 20 2025


Alternative Linked Data Documents: ODE     Content Formats:   [cxml] [csv]     RDF   [text] [turtle] [ld+json] [rdf+json] [rdf+xml]     ODATA   [atom+xml] [odata+json]     Microdata   [microdata+json] [html]    About   
This material is Open Knowledge   W3C Semantic Web Technology [RDF Data] Valid XHTML + RDFa
OpenLink Virtuoso version 07.20.3240 as of Nov 11 2024, on Linux (x86_64-ubuntu_focal-linux-gnu), Single-Server Edition (72 GB total memory, 1 GB memory in use)
Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2025 OpenLink Software