Machine learning approach to auto-tagging online content for content marketing efficiency: A comparative analysis between methods and content type
End of Embargo: 2020-12-29
Rights: © 2019 Elsevier. Reproduced in accordance with the publisher's self-archiving policy. This manuscript version is made available under the CC-BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: As complex data becomes the norm, content marketers need a greater understanding of machine learning (ML) applications. Unstructured data, scattered across platforms in multiple forms, impedes performance and user experience. Automated classification offers a solution. We compare three state-of-the-art ML techniques for multilabel classification (Random Forest, K-Nearest Neighbour, and Neural Network) to automatically tag and classify online news articles. The Neural Network performs best, yielding an F1 score of 70%, and provides satisfactory cross-platform applicability on the same organisation's YouTube content. The developed model can automatically label 99.6% of the unlabelled website content and 96.1% of the unlabelled YouTube content. We thus contribute to the marketing literature through a comparative evaluation of ML models for multilabel content classification and a cross-channel validation on a different type of content. The results suggest that organisations can use ML to auto-tag content across various platforms, opening avenues for aggregated analyses of content performance.
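The comparison the abstract describes can be sketched in scikit-learn: all three model families accept a binary label-indicator matrix, so the same multilabel pipeline serves each one. This is a minimal illustration, not the paper's implementation; the toy articles, tag names, and hyperparameters below are invented for the example.

```python
# Hedged sketch of a multilabel auto-tagging comparison (Random Forest,
# K-Nearest Neighbour, Neural Network) on TF-IDF features, scored with
# micro-averaged F1. Corpus and tags are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score

articles = [
    ("stock markets rally as oil prices fall", {"economy"}),
    ("new smartphone release boosts tech shares", {"technology", "economy"}),
    ("league final draws record television audience", {"sports"}),
    ("chip maker unveils faster processor", {"technology"}),
    ("central bank holds interest rates steady", {"economy"}),
    ("star striker transfers for record fee", {"sports", "economy"}),
]
texts, tags = zip(*articles)

X = TfidfVectorizer().fit_transform(texts)
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)  # one binary column per tag

models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=2),
    "Neural Network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                    random_state=0),
}
for name, model in models.items():
    model.fit(X, Y)              # all three handle a multilabel target matrix
    pred = model.predict(X)
    print(name, round(f1_score(Y, pred, average="micro"), 2))
```

In practice the paper's 70% F1 figure would come from held-out data, not training-set predictions as above; the sketch only shows that the three estimators are interchangeable behind one multilabel interface.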
Citation: Salminen J, Yoganathan V, Corporan J et al (2019) Machine learning approach to auto-tagging online content for content marketing efficiency: A comparative analysis between methods and content type. Journal of Business Research. 101: 203-217.
Link to publisher's version: https://doi.org/10.1016/j.jbusres.2019.04.018
Notes: The full text of this article will be released for public view at the end of the publisher embargo on 29 Dec 2020.
Showing items related by title, author, creator and subject.
Electronic word of mouth in social media: the common characteristics of retweeted and favourited marketer-generated content posted on Twitter
Alboqami, H.; Al-Karaghouli, W.; Baeshen, Y.; Erkan, I.; Evans, C.; Ghoneim, Ahmad (2015)
Marketers wish to utilise electronic word of mouth (eWOM) marketing on social media sites. However, not all online content generated by marketers has the same effect on consumers; some content is effective while other content is not. This paper examines the characteristics of marketer-generated content (MGC) that lead users to engage in eWOM. Twitter was chosen as one of the leading social media sites, and a content analysis approach was employed to identify the common characteristics of retweeted and favourited tweets. 2,780 tweets from six companies (Booking, Hostelworld, Hotels, Lastminute, Laterooms and Priceline) operating in the tourism sector were analysed. Results indicate that posts containing pictures, hyperlinks, product or service information, direct answers to customers, and brand centrality are more likely to be retweeted and favourited by users. The findings present the main eWOM drivers for MGC in social media.
Semantic content analysis for effective video segmentation, summarisation and retrieval.
Jiang, Jianmin; Ipson, Stanley S.; Ren, Jinchang (University of Bradford, Department of Electronic Imaging and Media Communications, 2010-03-10)
This thesis focuses on four main research themes, namely shot boundary detection, fast frame alignment, activity-driven video summarisation, and highlights-based video annotation and retrieval. A number of novel algorithms have been proposed to address these issues, which can be highlighted as follows. Firstly, accurate and robust shot boundary detection is achieved through modelling of cuts into sub-categories and appearance-based modelling of several gradual transitions, along with some novel features extracted from compressed video. Secondly, fast and robust frame alignment is achieved via the proposed subspace phase correlation (SPC) and an improved sub-pixel strategy. The SPC is proved to be insensitive to zero-mean noise, and its gradient-based extension is even robust to non-zero-mean noise and can be used to deal with non-overlapped regions for robust image registration. Thirdly, hierarchical modelling of rush videos using formal language techniques is proposed, which can guide the modelling and removal of several kinds of junk frames as well as adaptive clustering of retakes. With an extracted activity-level measurement, shots and sub-shots are detected for content-adaptive video summarisation. Fourthly, highlights-based video annotation and retrieval is achieved, in which statistical modelling of skin pixel colours, knowledge-based shot detection, and improved determination of camera motion patterns are employed. Within these proposed techniques, one important principle is to integrate various kinds of feature evidence and to incorporate prior knowledge in modelling the given problems.
High-level hierarchical representation is extracted from the original linear structure for effective management and content-based retrieval of video data. As most of the work is implemented in the compressed domain, one additional benefit is the achieved high efficiency, which will be useful for many online applications.
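The frame-alignment theme above builds on phase correlation, which recovers a translation between two frames from the phase of their cross-power spectrum. A minimal sketch of standard phase correlation follows; the thesis's subspace phase correlation (SPC) is a more robust variant of this same principle, and the image sizes and test shift here are invented for illustration.

```python
# Sketch of standard phase correlation for integer translation estimation.
# The SPC described in the abstract extends this basic idea; this is the
# textbook form, not the thesis's algorithm.
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the circular (row, col) shift taking image b to image a."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = Fa * np.conj(Fb)
    cross /= np.abs(cross) + 1e-12          # keep phase, discard magnitude
    corr = np.fft.ifft2(cross).real         # impulse at the shift position
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(p - s if p > s // 2 else p for p, s in zip(peak, corr.shape))

img = np.random.default_rng(1).random((64, 64))
shifted = np.roll(img, (5, -3), axis=(0, 1))
print(phase_correlation_shift(shifted, img))   # -> (5, -3)
```

Because only the phase is kept, the correlation surface is a sharp impulse even when the two frames differ in overall brightness, which is what makes the method attractive for fast alignment.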
Video extraction for fast content access to MPEG compressed videos
Jiang, Jianmin; Weng, Y. (2009-06-09)
As existing video processing technology is primarily developed in the pixel domain, yet digital video is stored in compressed format, any application of those techniques to compressed videos would require decompression. For discrete cosine transform (DCT)-based MPEG compressed videos, the computing cost of standard row-by-row and column-by-column inverse DCT (IDCT) transforms for a block of 8×8 elements requires 4096 multiplications and 4032 additions, although practical implementation only requires 1024 multiplications and 896 additions. In this paper, we propose a new algorithm to extract videos directly from the MPEG compressed domain (DCT domain) without full IDCT, described in three extraction schemes: 1) video extraction in 2×2 blocks with four DCT coefficients; 2) video extraction in 4×4 blocks with four DCT coefficients; and 3) video extraction in 4×4 blocks with nine DCT coefficients. The computing cost incurred requires only 8 additions and no multiplication for the first scheme, 2 multiplications and 28 additions for the second scheme, and 47 additions (no multiplication) for the third scheme. Extensive experiments were carried out, and the results reveal that: 1) the extracted video maintains competitive quality in terms of visual perception and inspection; and 2) the extracted videos preserve the content well in comparison with fully decompressed ones in terms of histogram measurement. As a result, the proposed algorithm will provide useful tools in bridging the gap between the pixel domain and the compressed domain to facilitate content analysis with low latency and high efficiency, such as applications in surveillance videos, interactive multimedia, and image processing.
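The first scheme above can be illustrated with a small floating-point sketch: take only the four lowest-frequency DCT coefficients of an 8×8 block and invert them with a 2×2 IDCT to get a thumbnail. This is a stand-in for the idea, not the paper's addition-only integer derivation; the rescaling factor and test block are assumptions of the sketch.

```python
# Sketch of scheme 1: approximate a 2x2 thumbnail of an 8x8 pixel block from
# its four lowest-frequency DCT coefficients. The paper derives an
# addition-only integer form; this uses ordinary matrix arithmetic.
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II matrix C, so that F = C @ x @ C.T for an n x n block."""
    j = np.arange(n)
    C = np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n))
    C[0] /= np.sqrt(2)
    return C * np.sqrt(2 / n)

C8, C2 = dct_basis(8), dct_basis(2)

def thumb_2x2(block):
    F = C8 @ block @ C8.T          # forward 8x8 DCT (already stored in MPEG)
    F2 = F[:2, :2]                 # keep only the four low-frequency coefficients
    return (C2.T @ F2 @ C2) / 4.0  # 2x2 inverse DCT, rescaled to pixel range

flat = np.full((8, 8), 100.0)      # a flat grey block
print(thumb_2x2(flat))             # -> approximately 100.0 everywhere
```

The division by 4 compensates for the change in block size (8/2), so the thumbnail preserves the block's mean intensity, which is why the extracted video tracks the histogram of the fully decompressed one.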