Monday, June 16, 2014

Plans for Computer Vision Analysis

Text Processing

We use text detection and OCR to extract text automatically. A text line detection program generates bounding boxes for the text lines, and each image takes about 1 minute to process. For OCR we use Tesseract. Text detection results are here:
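To make the hand-off between line detection and OCR concrete, here is a minimal sketch, not our actual pipeline code. It assumes the detector returns boxes as (x, y, width, height) in pixels, which is a guess at the format, and the names clamp_box, reading_order, and ocr_text_lines are hypothetical. The OCR engine is passed in as a callable so the sketch does not depend on how Tesseract is invoked.

```python
from typing import Callable, List, Tuple

# Assumed detector output format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]
# Crop rectangle: (left, top, right, bottom), right/bottom exclusive.
Crop = Tuple[int, int, int, int]


def clamp_box(box: Box, img_w: int, img_h: int, pad: int = 2) -> Crop:
    """Pad a detected box slightly and clip it to the image bounds."""
    x, y, w, h = box
    left = max(0, x - pad)
    top = max(0, y - pad)
    right = min(img_w, x + w + pad)
    bottom = min(img_h, y + h + pad)
    return (left, top, right, bottom)


def reading_order(boxes: List[Box]) -> List[Box]:
    """Sort boxes top to bottom, then left to right."""
    return sorted(boxes, key=lambda b: (b[1], b[0]))


def ocr_text_lines(
    boxes: List[Box],
    img_w: int,
    img_h: int,
    recognize: Callable[[Crop], str],
) -> List[Tuple[Crop, str]]:
    """Run the supplied OCR callable on each text line crop.

    `recognize` is a placeholder for whatever wraps the OCR engine;
    it receives a crop rectangle and returns the recognized string.
    Lines that come back empty are dropped.
    """
    results = []
    for box in reading_order(boxes):
        crop = clamp_box(box, img_w, img_h)
        text = recognize(crop).strip()
        if text:
            results.append((crop, text))
    return results
```

With Pillow and the pytesseract wrapper installed, recognize could be something like lambda crop: pytesseract.image_to_string(img.crop(crop)), where img is the opened page image. That wiring is an assumption about the setup, not a description of what we run.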

Future Plan

Here is what we discussed. After extracting the text, we can try removing it and then detecting the major graphic elements by parsing the image. First, we over-segment the image into many super-pixels. Then, using computer vision techniques, we group and classify the super-pixels into elements such as graphics, photos, or background. From the resulting segments we can either extract bounding boxes or generate a heat map or feature representation that could later be used to improve search.
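The last step of this plan can be sketched in a few lines. The example below is only an illustration under stated assumptions: the over-segmentation is assumed to arrive as a 2D grid of segment ids (the kind of label map a super-pixel method such as SLIC produces), and the classifier output is assumed to be a plain dict from segment id to class name. The function names and the cell size are hypothetical, and the segmentation and classification themselves are not implemented here.

```python
from typing import Dict, List, Tuple

LabelMap = List[List[int]]  # assumed input: one segment id per pixel


def segment_bounding_boxes(labels: LabelMap) -> Dict[int, Tuple[int, int, int, int]]:
    """Return {segment_id: (left, top, right, bottom)}, inclusive pixel coordinates."""
    boxes: Dict[int, Tuple[int, int, int, int]] = {}
    for y, row in enumerate(labels):
        for x, seg in enumerate(row):
            if seg in boxes:
                left, top, right, bottom = boxes[seg]
                boxes[seg] = (min(left, x), min(top, y), max(right, x), max(bottom, y))
            else:
                boxes[seg] = (x, y, x, y)
    return boxes


def class_heat_map(
    labels: LabelMap,
    seg_class: Dict[int, str],
    target: str,
    cell: int = 2,
) -> List[List[float]]:
    """Coarse heat map: fraction of pixels per cell whose segment has class `target`.

    The image is divided into cell x cell blocks; edge blocks may be smaller.
    """
    height, width = len(labels), len(labels[0])
    rows = (height + cell - 1) // cell
    cols = (width + cell - 1) // cell
    hits = [[0] * cols for _ in range(rows)]
    totals = [[0] * cols for _ in range(rows)]
    for y, row in enumerate(labels):
        for x, seg in enumerate(row):
            cy, cx = y // cell, x // cell
            totals[cy][cx] += 1
            if seg_class.get(seg) == target:
                hits[cy][cx] += 1
    return [[hits[r][c] / totals[r][c] for c in range(cols)] for r in range(rows)]


# Toy 4x4 label map with three segments and made-up class labels.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
    [2, 2, 2, 2],
]
seg_class = {0: "graphics", 1: "photo", 2: "background"}
boxes = segment_bounding_boxes(labels)
heat = class_heat_map(labels, seg_class, "graphics", cell=2)
```

The bounding boxes give discrete elements that could be indexed directly, while the heat map keeps a fixed-size summary of where a class occurs, which may be easier to compare across images for search.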


