Show simple item record

dc.contributor.author: Mehmood, Irfan
dc.contributor.author: Sajjad, M.
dc.contributor.author: Rho, S.
dc.contributor.author: Baik, S.W.
dc.date.accessioned: 2019-07-18T12:41:49Z
dc.date.accessioned: 2019-08-05T10:14:58Z
dc.date.available: 2019-07-18T12:41:49Z
dc.date.available: 2019-08-05T10:14:58Z
dc.date.issued: 2016-01
dc.identifier.citation: Mehmood I, Sajjad M, Rho S et al (2016) Divide-and-conquer based summarization framework for extracting affective video content. Neurocomputing. 174(Part A): 393-403.
dc.identifier.uri: http://hdl.handle.net/10454/17185
dc.description: Yes
dc.description.abstract: Recent advances in multimedia technology have led to tremendous increases in the available volume of video data, creating a major requirement for efficient systems to manage such huge data volumes. Video summarization is one of the key techniques for accessing and managing large video libraries. Video summarization can be used to extract the affective contents of a video sequence and generate a concise representation of its content. Human attention models are an efficient means of affective content extraction. Existing visual-attention-driven summarization frameworks have high computational cost and memory requirements, as well as a lack of efficiency in accurately perceiving human attention. To cope with these issues, we propose a divide-and-conquer based framework for efficient summarization of big video data. We divide the original video data into shots, and an attention model is computed from each shot in parallel. The viewer's attention is based on multiple sensory perceptions, i.e., aural and visual, as well as the viewer's neuronal signals. The aural attention model is based on the Teager energy, instant amplitude, and instant frequency, whereas the visual attention model employs multi-scale contrast and motion intensity. Moreover, the neuronal attention is computed using the beta-band frequencies of neuronal signals. Next, an aggregated attention curve is generated using an intra- and inter-modality fusion mechanism. Finally, the affective content in each video shot is extracted. The fusion of multimedia and neuronal signals provides a bridge that links the digital representation of multimedia with the viewer's perceptions. Our experimental results indicate that the proposed shot-detection based divide-and-conquer strategy mitigates the time and computational complexity. Moreover, the proposed attention model provides an accurate reflection of user preferences and facilitates the extraction of highly affective and personalized summaries.
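The attention-fusion step described in the abstract can be sketched as follows. This is a minimal illustration only: the min-max normalization, equal-weight inter-modality fusion, and fixed threshold are assumptions for the sketch, not the paper's exact formulation of the intra-/inter-modality fusion mechanism.

```python
import numpy as np

def normalize(curve):
    # Intra-modality step (assumed): min-max normalize one attention curve to [0, 1].
    c = np.asarray(curve, dtype=float)
    rng = c.max() - c.min()
    return (c - c.min()) / rng if rng > 0 else np.zeros_like(c)

def fuse_attention(aural, visual, neuronal, weights=(1.0, 1.0, 1.0)):
    # Inter-modality step (assumed): weighted average of the normalized
    # aural, visual, and neuronal attention curves for one shot.
    curves = [normalize(aural), normalize(visual), normalize(neuronal)]
    w = np.asarray(weights, dtype=float)
    return sum(wi * c for wi, c in zip(w, curves)) / w.sum()

def affective_segments(fused, threshold=0.6):
    # Frames whose aggregated attention exceeds the (illustrative) threshold
    # are candidates for the affective summary.
    return np.flatnonzero(fused >= threshold)
```

Because each shot's curves are independent, `fuse_attention` can be run on all shots in parallel, which is the essence of the divide-and-conquer strategy.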
dc.description.sponsorship: Supported by the ICT R&D program of MSIP/IITP. [2014(R0112-14-1014), The Development of Open Platform for Service of Convergence Contents].
dc.language.iso: en
dc.relation.isreferencedby: https://doi.org/10.1016/j.neucom.2015.05.126
dc.rights: © 2015 Published by Elsevier B.V. Reproduced in accordance with the publisher's self-archiving policy. This manuscript version is made available under the CC-BY-NC-ND 4.0 license.
dc.subject: Affective content analysis
dc.subject: Big video data
dc.subject: Divide-and-conquer architecture
dc.subject: Human attention modeling
dc.title: Divide-and-conquer based summarization framework for extracting affective video content
dc.status.refereed: Yes
dc.date.Accepted: 2015-05-01
dc.date.application: 2015-09-01
dc.type: Article
dc.type.version: Accepted manuscript
dc.date.updated: 2019-07-18T11:41:51Z
refterms.dateFOA: 2019-08-05T10:15:23Z


Item file(s)

Name: Divide-and-conquerbasedsummari ...
Size: 1.422 MB
Format: PDF
Description: Mehmood_Neurocomputing
