Recognizing Art Style Automatically in Painting with Deep Learning

Abstract

The artistic style (or artistic movement) of a painting is a rich descriptor that captures both visual and historical information about the painting. Correctly identifying the artistic style of a painting is crucial for indexing large artistic databases. In this paper, we investigate the use of deep residual neural networks to solve the problem of detecting the art style of a painting, and we outperform existing approaches to reach an accuracy of 62% on the Wiki paintings dataset (for 25 different styles). To achieve this result, the network is first pre-trained on ImageNet and then deeply retrained for artistic style. We show empirically that, to achieve the best performance, one needs to retrain about 20 layers. This suggests that the two tasks are as similar as expected, and explains the previous success of hand-crafted features. We also demonstrate that the styles detected on the Wiki paintings dataset are consistent with styles detected on an independent dataset, and we describe a number of experiments conducted to validate this approach both qualitatively and quantitatively.

Keywords

Art style recognition, Painting, Feature extraction, Deep learning

Introduction

The Metropolitan Museum of New York has recently released over 375,000 pictures of public domain artwork that will soon be available for indexing (met, 2008). However, indexing artistic pictures requires a description of the visual style of the picture, in addition to the description of the content that is typically used to index non-artistic images. Identifying the style of a picture in a fully automatic way is a challenging problem. Indeed, although standard classification tasks such as facial recognition can rely on clearly identifiable features such as eyes or nose, classifying visual styles cannot rely on any definitive feature. As pointed out by Tan et al. (2016), the problem is particularly difficult for nonrepresentational artwork. Several academic research papers have addressed the problem of style recognition with existing machine learning approaches. For example, Florea et al. (2016) evaluated the performance of different combinations of popular image features (histograms of gradients, spatial envelopes, discriminative color names, etc.) with different classification algorithms (SVM, random forests, etc.). Despite the size of the dataset and the limited number of labels to predict (only 12 art movements in total), they observed that several styles remain hard to distinguish with these techniques. They also demonstrated that adding more features does not improve the accuracy of the models any further, presumably because of the curse of dimensionality.

In 2014, Karayev et al. observed that most systems designed for automatic style recognition were built on hand-crafted features, and they managed to recognize a larger variety of visual styles using a linear classifier trained on features extracted automatically by a deep convolutional neural network (CNN). The CNN they used was AlexNet (Krizhevsky et al., 2012), trained on ImageNet to recognize objects in non-artistic photographs. This approach was able to beat the state of the art, but the authors obtained even better results on the same datasets using complex hand-crafted features such as MC-bit (Bergamo and Torresani, 2012). More recently, Tan et al. (2016) addressed the problem using a variation of the same neural network and, for the first time, achieved the best performance with a fully automatic procedure, reaching an accuracy of 54.5% over 25 styles. In this paper, we further improve upon the state of the art and achieve over 62% accuracy on the same dataset, using a similar experimental protocol. This improvement is due to the two contributions described below. First, we demonstrate that residual neural networks, which have proven to be more efficient on object recognition tasks, also perform very well on art style recognition. Although it may be expected, this result is not immediately implied by the success of ResNet on object recognition, since art style recognition requires different types of features (as will be shown in the paper). As a matter of fact, state-of-the-art neural network architectures for object recognition have not always been state of the art for art style recognition (see Karayev et al. (2014), discussed above).
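The feature-based baseline described above (a linear classifier trained on features taken from a pre-trained CNN) can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of Karayev et al.: the two-dimensional feature vectors are made-up toy values standing in for CNN activations, the three class indices are placeholders for styles, and a multiclass perceptron is used as a simple stand-in because the text does not say which linear classifier was trained.

```python
def predict(weights, x):
    """Return the index of the class whose linear score is highest."""
    xb = list(x) + [1.0]  # append a constant 1.0 so the last weight acts as a bias
    scores = [sum(w * v for w, v in zip(row, xb)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_linear_classifier(features, labels, n_classes, epochs=20):
    """Multiclass perceptron: one weight vector (plus bias) per class.

    The feature extractor is treated as fixed; only these weights are learned.
    """
    dim = len(features[0])
    weights = [[0.0] * (dim + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, y in zip(features, labels):
            pred = predict(weights, x)
            if pred != y:
                xb = list(x) + [1.0]
                for j, v in enumerate(xb):
                    weights[y][j] += v      # pull the true class towards x
                    weights[pred][j] -= v   # push the wrongly predicted class away
    return weights


# Toy "CNN features" (hypothetical values) for three placeholder styles 0, 1, 2.
FEATURES = [(2.0, 0.0), (3.0, 0.5), (0.0, 2.0), (0.5, 3.0), (-2.0, -2.0), (-3.0, -2.5)]
LABELS = [0, 0, 1, 1, 2, 2]

weights = train_linear_classifier(FEATURES, LABELS, n_classes=3)
predictions = [predict(weights, x) for x in FEATURES]
```

The point of the sketch is the division of labour: the network that produces the features is never updated, so all style-specific learning has to happen in the final linear layer.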

Second, we demonstrate that a deeper retraining procedure is required to obtain the best performance from models pre-trained on ImageNet. In particular, we describe an experiment in which we retrained an increasing number of layers (starting from retraining the last layer only, up to retraining the complete network) and show that the best performance is obtained with about 20 layers retrained, in contrast with previous work where only the last layer was retrained. This result suggests that high-level features trained on ImageNet are not optimal for inferring style. In addition, the paper contains a number of other methodological insights. For example, we show that the styles learned using one dataset are consistent with the styles from another, independent dataset. This shows that (1) our classifier does not overfit the training dataset and (2) the learned styles are consistent with the artistic styles defined by art experts. The rest of the paper is organized as follows: in Section 2 we formally state the problem addressed in this paper and present the baselines used to assess the performance of our new approach; in Section 3 we describe our approach; and in Section 4 we present experimental results. Finally, we conclude with some comments and future directions.
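The retraining experiment described above can be pictured as a sweep over how many of the last layers are left trainable while the earlier ones keep their ImageNet weights. The sketch below only builds the freeze/retrain plan for each setting; it does no training. The layer count of 50 and the candidate depths are hypothetical values chosen for illustration, and the function names are not from the paper.

```python
def trainable_mask(n_layers, n_retrained):
    """Return one boolean per layer, ordered from input to output.

    True means the layer is retrained; False means it stays frozen
    at its pre-trained weights.
    """
    if not 0 <= n_retrained <= n_layers:
        raise ValueError("n_retrained must be between 0 and n_layers")
    n_frozen = n_layers - n_retrained
    return [i >= n_frozen for i in range(n_layers)]


def retraining_sweep(n_layers, depths):
    """Map each candidate number of retrained layers to its mask."""
    return {k: trainable_mask(n_layers, k) for k in depths}


N_LAYERS = 50  # hypothetical network depth, assumed for this example
sweep = retraining_sweep(N_LAYERS, depths=[1, 10, 20, 50])

for k, mask in sweep.items():
    print(f"retrain last {k:2d} layers, freeze first {mask.count(False):2d}")
```

In a framework such as PyTorch, a mask like this would typically be applied by setting `requires_grad = False` on the parameters of the frozen layers; that mapping is an assumption about tooling, since the text does not say which framework was used. Each setting would then be trained and evaluated separately, and the accuracies compared across depths.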

Conclusion

In this paper, we successfully applied a deep learning approach to achieve over 62% accuracy on WikiArt data. This improvement is mainly due to the use of a residual neural network and to deeper retraining. Beyond the obtained results, we provided empirical evidence for our choice of parameters, which constitutes a methodological contribution. As suggested by our experiments, the use of bigger datasets should make it possible to train more layers of our deep networks and should improve the accuracy of our models. Hence, we plan to extend our datasets so that we can perform a full retraining of our deep models. As future work, we plan to apply the deep network analysis proposed in (Montavon et al., 2016; Binder et al., 2016), once it is available for ResNet, in order to understand the learned features. By analyzing every layer in this way, we will be able to see how the networks characterize a given style.

About KSRA

The Kavian Scientific Research Association (KSRA) is a non-profit research organization established in December 2013 to provide research and educational services. Its members had first formed a virtual group on the Viber social network, and the core of the association was formed with these members as founders. Led by Professor Siavosh Kaviani, they decided to launch a scientific research association with an emphasis on education.

As a non-profit research firm, KSRA is committed to providing research services in the field of knowledge. The main beneficiaries of the association are public and private knowledge-based companies, students, researchers, professors, universities, and industrial and semi-industrial centers around the world.

Our main services are based on education for people of every background around the world. We want to integrate research and education. We believe education is a fundamental human right, so our services concentrate on inclusive education.

The KSRA team partners with local under-served communities around the world to improve the access to and quality of knowledge based on education, amplify and augment learning programs where they exist, and create new opportunities for e-learning where traditional education systems are lacking or non-existent.

FULL Paper PDF file:

Recognizing Art Style Automatically in painting with deep learning

Bibliography

Author: Adrian Lecoutre, Benjamin Negrevergne, Florian Yger

Year: 2017

Title: Recognizing Art Style Automatically in painting with deep learning

PDF reference and original file: Click here

Nasim Gazerani was born in 1983 in Arak. She holds a Master's degree in Software Engineering from UM University of Malaysia.

Professor Siavosh Kaviani was born in 1961 in Tehran. He had a professorship. He holds a Ph.D. in Software Engineering from the QL University of Software Development Methodology and an honorary Ph.D. from the University of Chelsea.

Somayeh Nosrati was born in 1982 in Tehran. She holds a Master's degree in artificial intelligence from Khatam University of Tehran.