Monday, July 7, 2014

Text detection pipeline


  1. Partition image pixels using k-means
  2. Extract connected components
  3. For each connected component extract the following features:
    1. Color histogram and a measure of consistency
    2. Stroke Width Transform histogram and a measure of consistency
    3. Number of holes in a component
  4. For every pair of connected components compute the following similarity metrics:
    1. Color Similarity
    2. Stroke Width Similarity
    3. Pixel distance in x and y directions separately
    4. Graph distance between every pair of components, i.e. the number of hops from one component to a neighboring component needed to get from component A to component B.
  5. Merge components according to the similarity metrics. (We use hand-tuned criteria.)
  6. Classify each component as text or non-text. (We use hand-set criteria.) The classification uses the following features:
    1. Area (the number of pixels)
    2. Stroke Width Consistency
    3. Width/Height
    4. Relative Stroke Length: Area/(Stroke_Width^2)
    5. Sparsity
  7. Display text regions together with their bounding boxes
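
As a rough illustration of steps 2, 4.4 and 6, here is a minimal Python sketch. It is not the code behind the results linked below, and several things in it are assumptions: the k-means output is taken to be a 2D grid of cluster ids, components are 4-connected, the neighbor graph is passed in as a plain dictionary (how neighbors are determined is not covered here), and the function names connected_components, graph_distance and shape_features are made up for this example. The stroke width is supplied by the caller instead of being computed with the Stroke Width Transform.

```python
from collections import deque


def connected_components(labels):
    """Step 2: group 4-connected pixels that share a cluster id.

    `labels` is a list of rows of cluster ids (assumed k-means output).
    Returns a list of components, each a list of (row, col) pixels.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and labels[ny][nx] == labels[y][x]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def graph_distance(adjacency, start, goal):
    """Step 4.4: number of hops from `start` to `goal` in the neighbor graph.

    `adjacency` maps a component id to its neighboring component ids
    (assumed representation). Returns None if `goal` is unreachable.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return dist[node]
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return None


def shape_features(pixels, stroke_width):
    """Step 6: area, width/height and relative stroke length of a component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    area = len(pixels)                      # number of pixels
    width = max(xs) - min(xs) + 1           # bounding-box width
    height = max(ys) - min(ys) + 1          # bounding-box height
    return {
        "area": area,
        "width_over_height": width / height,
        "relative_stroke_length": area / stroke_width ** 2,
    }
```

For example, connected_components([[0, 0, 1], [0, 1, 1], [2, 2, 1]]) splits the grid into three components, and graph_distance({0: [1], 1: [0, 2], 2: [1]}, 0, 2) counts two hops. The thresholds applied to these features in steps 5 and 6 are set by hand and are not shown here.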

Here are the outputs:
https://dl.dropboxusercontent.com/u/20022261/reports/text_segmentation_benchmark.html
