Multilingual Text Detection on Scene Images using MASK RCNN Method

Postgraduate Thesis uoadl:2942661

Unit:
Specialization in Big Data and Artificial Intelligence
Informatics
Deposit date:
2021-04-28
Year:
2021
Author:
Naoum Nikolaos
Supervisors info:
Katsouros Vasileios, Researcher A', Athena Research Center
Original Title:
Multilingual Text Detection on Scene Images using MASK RCNN Method
Languages:
English
Translated title:
Multilingual Text Detection on Scene Images using MASK RCNN Method
Summary:
In an era in which innovations arrive daily, ideas, methods, and procedures in computer science keep advancing. One of the most active research topics today is the detection and recognition of text in images and videos. The information carried by such text is useful for a wide range of real-life applications. However, localizing and reading text in natural scene images is a very complicated task. The number and variety of scene text detection and recognition applications on the market have drawn growing attention and curiosity from the computing community. Several problems in text detection remain unsolved, including variation in language, color, orientation, font, and style. Recent advances in deep learning have increased researchers' interest in scene text detection. A CNN is designed to learn spatial hierarchies of features automatically through multiple building blocks, its layers. First, it detects each word of the text separately; after detecting the different parts of the text, it reassembles the whole image and presents an output. This report analyzes text detection techniques such as LOMO and PMTD. Our proposed method uses the Mask R-CNN technique; it is implemented and tested in order to offer a framework that has a powerful baseline and advantages such as flexibility, robustness, and fast training and inference. All of these methods help us accomplish two tasks: instance segmentation and text detection on scene images. The methods proposed in the research literature have already yielded some applications that include multilingual text detection on scene images, but this does not mean that these applications are ideal or perfect.
These applications still lag in places and lack some features, leaving room for improvement that can help achieve better results in terms of boundary detection. Technology improves day by day, and keeping up to date with it and contributing to it is the best way to express gratitude to it.
Main subject category:
Technology - Computer science
Keywords:
text detection, CNN, detector, ICDAR, competition, Deep Learning, Computer Vision, bounding box, datasets, RCNN, algorithm, recognition, MASK R-CNN, hyper parameters
Index:
Yes
Number of index pages:
6
Contains images:
Yes
Number of references:
39
Number of pages:
84
MSc_Thesis_Nikolaos_Naoum.pdf (2 MB)