We will use the Azure Computer Vision OCR API to read text from images.

First, log in to portal.azure.com and create a new Resource Group, say ‘Translator’. In this resource group, add a Computer Vision resource (the OCR endpoint belongs to the Computer Vision service).

Vision.JPG

The Free tier is good enough for experimentation.

Copy one of the service keys from the resource's Keys blade; your app sends it with every request, as shown below.
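The code below reads the key from the application's resource dictionary under the (arbitrary) name BingImageSearchKey. One way to put it there, for example during app startup, is sketched here; the placeholder key string is an assumption, and it presumes Application.Current.Resources is already available at that point:

// Somewhere during app startup: store the Computer Vision subscription key under the
// same name the OCR code reads back later. "<your-subscription-key>" is a placeholder.
Application.Current.Resources["BingImageSearchKey"] = "<your-subscription-key>";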

readonly static string uri = "https://westcentralus.api.cognitive.microsoft.com/vision/v2.0/ocr?language=unk&detectOrientation=true";

// Requires System.Net.Http, System.Net.Http.Headers, System.Text and the OCRResponse namespace below.
// Use HttpClient to POST the image to the OCR endpoint.
var client = new HttpClient();
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", (string)Application.Current.Resources["BingImageSearchKey"]);

var content = new ByteArrayContent(imageBytes); // imageBytes: the image as a byte[] (see below)
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

var response = await client.PostAsync(uri, content);
var responseText = await response.Content.ReadAsStringAsync();
var jsonResponse = JsonResponse.FromJson(responseText);

// Walk regions -> lines -> words and rebuild the recognized text.
var sb = new StringBuilder();
foreach (Region region in jsonResponse.Regions)
{
  foreach (OCRResponse.Line line in region.Lines)
  {
    foreach (Word word in line.Words)
    {
      sb.Append(word.Text);
      sb.Append(" ");
    }
    sb.AppendLine();
  }
}

The image itself is the content of the request above: the raw image bytes are posted as application/octet-stream.
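As a minimal sketch of where those bytes might come from (the helper name and file-path parameter are assumptions, not from the original post; it needs System.IO and System.Net.Http.Headers):

// Hypothetical helper: read an image file from disk and wrap it as the request body.
static ByteArrayContent CreateImageContent(string imagePath)
{
    byte[] imageBytes = File.ReadAllBytes(imagePath);
    var content = new ByteArrayContent(imageBytes);
    content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    return content;
}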

The OCR endpoint returns JSON describing the detected language, text angle, orientation, and a hierarchy of regions, lines, and words. An illustrative sample of that response is shown first, followed by the C# model classes used to deserialize it.
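The sample below is only illustrative (the values are made up); each boundingBox is a pixel rectangle in "x,y,width,height" form:

{
  "language": "en",
  "textAngle": 0.0,
  "orientation": "Up",
  "regions": [
    {
      "boundingBox": "21,16,304,451",
      "lines": [
        {
          "boundingBox": "28,16,288,41",
          "words": [
            { "boundingBox": "28,16,202,41", "text": "HELLO" },
            { "boundingBox": "236,16,80,41", "text": "WORLD" }
          ]
        }
      ]
    }
  ]
}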

using System.Globalization;
using Newtonsoft.Json;
using Newtonsoft.Json.Converters;

namespace OCRResponse
{
    public partial class JsonResponse
    {
        [JsonProperty("language")]
        public string Language { get; set; }

        [JsonProperty("orientation")]
        public string Orientation { get; set; }

        [JsonProperty("textAngle")]
        public double TextAngle { get; set; }

        [JsonProperty("regions")]
        public Region[] Regions { get; set; }
    }

    public partial class Region
    {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }

        [JsonProperty("lines")]
        public Line[] Lines { get; set; }
    }

    public partial class Line
    {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }

        [JsonProperty("words")]
        public Word[] Words { get; set; }
    }

    public partial class Word
    {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }

        [JsonProperty("text")]
        public string Text { get; set; }
    }

    public partial class JsonResponse
    {
        public static JsonResponse FromJson(string json) =>
            JsonConvert.DeserializeObject<JsonResponse>(json, Converter.Settings);
    }

    public static class Serialize
    {
        public static string ToJson(this JsonResponse self) =>
            JsonConvert.SerializeObject(self, Converter.Settings);
    }

    internal static class Converter
    {
        public static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
        {
            MetadataPropertyHandling = MetadataPropertyHandling.Ignore,
            DateParseHandling = DateParseHandling.None,
            Converters =
            {
                new IsoDateTimeConverter { DateTimeStyles = DateTimeStyles.AssumeUniversal }
            },
        };
    }
}

Pretty simple, right?
That is all you need to do to build your own OCR with the Azure Vision API.