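Optical character recognition (OCR) is the process of scanning text documents and then analyzing the resulting image files to extract the text and layout information. OCR is a fairly specialized technology, used mostly by people in the printing industry to convert paper documents into electronic ones quickly. For Chinese OCR, the stronger domestic products at present are Tsinghua Wintone (清华文通), Hanwang (汉王), and Shangshu (尚书); each has its strengths, and none of them is cheap. OCR developed earlier abroad: large companies such as IBM, Microsoft, and HP, even where they have not released a standalone OCR product, have long had R&D teams that master the core technology and have built OCR into their own software. As programmers we rarely need anything that advanced; being able to integrate basic OCR functionality during development is usually enough. Over the past couple of days I looked through many free OCR programs and libraries and decided to write up what I found. Today's post covers Tesseract; the next one will discuss the OCR API in OneNote 2010. A brief history of OCR technology can be found here.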
Reposted from: http://www.cnblogs.com/brooks-dotnet/archive/2010/10/05/1844203.html
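1. Tesseract overview
The Tesseract OCR engine was first developed at HP Labs starting in 1985, and by 1995 it ranked among the three most accurate OCR engines in the industry. Not long afterwards, however, HP decided to leave the OCR business, and Tesseract was shelved.
Years later, HP concluded that rather than leave Tesseract on the shelf, it would do better to contribute it to the open-source community and give it a new life. In 2005 Tesseract was taken over by the Information Science Research Institute (ISRI) at the University of Nevada, Las Vegas, which turned to Google to improve Tesseract, fix bugs, and optimize it.
Tesseract is now published as an open-source project on Google Code; its project home page can be viewed here. The latest version, 3.0, supports Chinese OCR and ships with a command-line tool. In this post we test Tesseract 3.0. Because the command line is not very friendly to end users, I wrapped it in a simple WPF application so that Chinese OCR can be done conveniently.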
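1.1 First, download the command-line tool, the source code, and the Chinese language pack from the Tesseract project home page:
1.2 After extraction, the command-line tool directory looks like this (1.jpg and 1.txt are not part of the package):
1.3 To run Chinese OCR, copy the Simplified Chinese language pack into the [tessdata] directory: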
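Before calling the engine, an application can check that the language pack really is in place. The following is a minimal sketch, not part of the original program; the class and method names are hypothetical, and it assumes that Tesseract 3.x stores each language as `<lang>.traineddata` inside the tessdata directory.

```csharp
using System;
using System.IO;

public static class LanguagePackCheck
{
    // Builds the path where the language data file is expected.
    // Assumption: Tesseract 3.x names language files "<lang>.traineddata".
    public static string ExpectedLanguageFile(string tesseractDir, string lang)
    {
        return Path.Combine(tesseractDir, "tessdata", lang + ".traineddata");
    }

    // Returns true when the language file exists under <tesseractDir>/tessdata.
    public static bool IsLanguageInstalled(string tesseractDir, string lang)
    {
        return File.Exists(ExpectedLanguageFile(tesseractDir, lang));
    }
}
```

For example, `LanguagePackCheck.IsLanguageInstalled(tesseractDir, "chi_sim")` could be called before starting OCR, so the user gets a clear message instead of a failed run.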
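1.4 In a DOS prompt, change to the Tesseract command-line directory and check the command format of tesseract.exe:
imagename is the image to OCR; outputbase is the output file for the OCR result, a text file (.txt) by default; lang is the language pack to use; configfile is a configuration file.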
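The four parts of the command described above can be assembled in code. This is a rough sketch under my own assumptions: the class `TesseractArguments` and its `Build` method are hypothetical names, the argument order follows the description above (imagename, outputbase, then `-l lang`, then configfile), and paths containing spaces are wrapped in quotes.

```csharp
using System;
using System.Text;

public static class TesseractArguments
{
    // Assembles "imagename outputbase [-l lang] [configfile]".
    // lang and configFile are optional and skipped when null or empty.
    public static string Build(string imageName, string outputBase,
                               string lang = null, string configFile = null)
    {
        var sb = new StringBuilder();
        sb.Append(Quote(imageName)).Append(' ').Append(Quote(outputBase));
        if (!string.IsNullOrEmpty(lang))
        {
            sb.Append(" -l ").Append(lang);
        }
        if (!string.IsNullOrEmpty(configFile))
        {
            sb.Append(' ').Append(Quote(configFile));
        }
        return sb.ToString();
    }

    // Wraps a value in double quotes only when it contains a space.
    private static string Quote(string value)
    {
        return value.Contains(" ") ? "\"" + value + "\"" : value;
    }
}
```

`TesseractArguments.Build("1.jpg", "1", "chi_sim")` produces `1.jpg 1 -l chi_sim`, which is the argument string used in the test in step 1.5.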
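1.5 Now for a test. Prepare an image in JPG format; here I placed it in the same directory as Tesseract:
Type tesseract.exe 1.jpg 1 -l chi_sim and press Enter. The OCR finishes within a few seconds:
Note the command format: imagename must include the .jpg extension, while the output file and the language pack are given without an extension.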
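The naming rule above (image with extension, output without) is easy to get wrong when the names are built in code. A small sketch, with hypothetical helper names, that derives the output base from the image path and the name of the text file Tesseract is expected to write:

```csharp
using System;
using System.IO;

public static class OutputNaming
{
    // Output base = output directory + image file name without extension,
    // e.g. "1.jpg" gives "1". Tesseract appends ".txt" itself.
    public static string OutputBaseFor(string imagePath, string outputDir)
    {
        return Path.Combine(outputDir, Path.GetFileNameWithoutExtension(imagePath));
    }

    // The text file that should exist after a successful run
    // (assuming the default text output described earlier).
    public static string ResultFileFor(string outputBase)
    {
        return outputBase + ".txt";
    }
}
```

With an empty output directory, `OutputBaseFor("1.jpg", "")` gives `1`, and `ResultFileFor("1")` gives `1.txt`, matching the files in the example above.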
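OCR result:
As you can see, the result is not ideal. Chinese recognition is passable, but most of the English text and digits come out garbled. Still, for a veteran OCR engine this is a respectable showing, and I look forward to further improvements from Google.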
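2. Wrapping the Tesseract command line with WPF
2.1 Since command lines are easy to mistype and unfriendly to end users, I wrote a small WPF program that wraps the Tesseract command line:
The left side is for selecting and previewing the image; the right side is for choosing the output directory and displaying the OCR result. Both local and network images can be previewed.
2.2 To let the image preview zoom and pan, I originally planned to use Microsoft's Zoom It API, but it does not support WPF, so I used a third-party class instead: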
Image zoom and pan helper class:

    using System;
    using System.Windows.Controls;
    using System.Windows.Input;
    using System.Windows.Media.Animation;
    using System.Windows;
    using System.Windows.Media;

    namespace PanAndZoom
    {
        public class PanAndZoomViewer : ContentControl
        {
            public double DefaultZoomFactor { get; set; }
            private FrameworkElement source;
            private Point ScreenStartPoint = new Point(0, 0);
            private TranslateTransform translateTransform;
            private ScaleTransform zoomTransform;
            private TransformGroup transformGroup;
            private Point startOffset;

            public PanAndZoomViewer()
            {
                this.DefaultZoomFactor = 1.4;
            }

            public override void OnApplyTemplate()
            {
                base.OnApplyTemplate();
                Setup(this);
            }

            void Setup(FrameworkElement control)
            {
                this.source = VisualTreeHelper.GetChild(this, 0) as FrameworkElement;

                // Scale first, then translate; both are applied to the hosted content.
                this.translateTransform = new TranslateTransform();
                this.zoomTransform = new ScaleTransform();
                this.transformGroup = new TransformGroup();
                this.transformGroup.Children.Add(this.zoomTransform);
                this.transformGroup.Children.Add(this.translateTransform);
                this.source.RenderTransform = this.transformGroup;
                this.Focusable = true;
                this.KeyDown += new KeyEventHandler(source_KeyDown);
                this.MouseMove += new MouseEventHandler(control_MouseMove);
                this.MouseDown += new MouseButtonEventHandler(source_MouseDown);
                this.MouseUp += new MouseButtonEventHandler(source_MouseUp);
                this.MouseWheel += new MouseWheelEventHandler(source_MouseWheel);
            }

            void source_KeyDown(object sender, KeyEventArgs e)
            {
                // hit escape to reset everything
                if (e.Key == Key.Escape) Reset();
            }

            void source_MouseWheel(object sender, MouseWheelEventArgs e)
            {
                // zoom into the content. Calculate the zoom factor based on the direction of the mouse wheel.
                double zoomFactor = this.DefaultZoomFactor;
                if (e.Delta <= 0) zoomFactor = 1.0 / this.DefaultZoomFactor;
                // DoZoom requires both the logical and physical location of the mouse pointer
                var physicalPoint = e.GetPosition(this);
                DoZoom(zoomFactor, this.transformGroup.Inverse.Transform(physicalPoint), physicalPoint);
            }

            void source_MouseUp(object sender, MouseButtonEventArgs e)
            {
                if (this.IsMouseCaptured)
                {
                    // we're done. reset the cursor and release the mouse pointer
                    this.Cursor = Cursors.Arrow;
                    this.ReleaseMouseCapture();
                }
            }

            void source_MouseDown(object sender, MouseButtonEventArgs e)
            {
                // Save starting point, used later when determining how much to scroll.
                this.ScreenStartPoint = e.GetPosition(this);
                this.startOffset = new Point(this.translateTransform.X, this.translateTransform.Y);
                this.CaptureMouse();
                this.Cursor = Cursors.ScrollAll;
            }

            void control_MouseMove(object sender, MouseEventArgs e)
            {
                if (this.IsMouseCaptured)
                {
                    // if the mouse is captured then move the content by changing the translate transform.
                    // use the Pan Animation to animate to the new location based on the delta between the
                    // starting point of the mouse and the current point.
                    var physicalPoint = e.GetPosition(this);
                    this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreatePanAnimation(physicalPoint.X - this.ScreenStartPoint.X + this.startOffset.X), HandoffBehavior.Compose);
                    this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreatePanAnimation(physicalPoint.Y - this.ScreenStartPoint.Y + this.startOffset.Y), HandoffBehavior.Compose);
                }
            }

            /// <summary>Helper to create the panning animation for x,y coordinates.</summary>
            /// <param name="toValue">New value of the coordinate.</param>
            /// <returns>Double animation</returns>
            private DoubleAnimation CreatePanAnimation(double toValue)
            {
                var da = new DoubleAnimation(toValue, new Duration(TimeSpan.FromMilliseconds(300)));
                da.AccelerationRatio = 0.1;
                da.DecelerationRatio = 0.9;
                da.FillBehavior = FillBehavior.HoldEnd;
                da.Freeze();
                return da;
            }

            /// <summary>Helper to create the zoom double animation for scaling.</summary>
            /// <param name="toValue">Value to animate to.</param>
            /// <returns>Double animation.</returns>
            private DoubleAnimation CreateZoomAnimation(double toValue)
            {
                var da = new DoubleAnimation(toValue, new Duration(TimeSpan.FromMilliseconds(500)));
                da.AccelerationRatio = 0.1;
                da.DecelerationRatio = 0.9;
                da.FillBehavior = FillBehavior.HoldEnd;
                da.Freeze();
                return da;
            }

            /// <summary>Zoom into or out of the content.</summary>
            /// <param name="deltaZoom">Factor to multiply the zoom level by.</param>
            /// <param name="mousePosition">Logical mouse position relative to the original content.</param>
            /// <param name="physicalPosition">Actual mouse position on the screen (relative to the parent window)</param>
            public void DoZoom(double deltaZoom, Point mousePosition, Point physicalPosition)
            {
                double currentZoom = this.zoomTransform.ScaleX;
                currentZoom *= deltaZoom;
                this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreateZoomAnimation(-1 * (mousePosition.X * currentZoom - physicalPosition.X)));
                this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreateZoomAnimation(-1 * (mousePosition.Y * currentZoom - physicalPosition.Y)));
                this.zoomTransform.BeginAnimation(ScaleTransform.ScaleXProperty, CreateZoomAnimation(currentZoom));
                this.zoomTransform.BeginAnimation(ScaleTransform.ScaleYProperty, CreateZoomAnimation(currentZoom));
            }

            /// <summary>Reset to default zoom level and centered content.</summary>
            public void Reset()
            {
                this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreateZoomAnimation(0));
                this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreateZoomAnimation(0));
                this.zoomTransform.BeginAnimation(ScaleTransform.ScaleXProperty, CreateZoomAnimation(1));
                this.zoomTransform.BeginAnimation(ScaleTransform.ScaleYProperty, CreateZoomAnimation(1));
            }
        }
    }
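The translation formula in DoZoom is easier to follow with the animation stripped away. The sketch below is my own one-dimensional restatement, not code from the third-party class; `ZoomMath` and its methods are hypothetical names. Because the transform group scales first and then translates, a content coordinate maps to the screen as logical * zoom + translate, and the offset DoZoom animates to is the one that keeps the point under the mouse pointer where it is.

```csharp
using System;

public static class ZoomMath
{
    // Forward mapping of the transform group: scale first, then translate.
    public static double ToScreen(double logical, double zoom, double translate)
    {
        return logical * zoom + translate;
    }

    // Inverse mapping, the 1-D equivalent of transformGroup.Inverse.Transform.
    public static double ToLogical(double screen, double zoom, double translate)
    {
        return (screen - translate) / zoom;
    }

    // Same expression as in DoZoom: the translation that puts the logical
    // point back at the physical mouse position after zooming.
    public static double TranslateAfterZoom(double logical, double newZoom, double physical)
    {
        return -1 * (logical * newZoom - physical);
    }
}
```

Substituting the result of TranslateAfterZoom into ToScreen gives logical * newZoom + physical - logical * newZoom, which is the physical position again. That is why the image appears to grow or shrink around the pointer instead of around its top-left corner.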
2.3 Besides the mouse, sliders can also be used to adjust the image preview:
Data binding:

    <WrapPanel Grid.Row="2" Grid.Column="0">
        <!-- Slider bound to the image Width (label text: "Length:") -->
        <Label Name="lab长度" Content="长度:" Margin="3" />
        <Slider Name="sl长度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="400" Value="{Binding ElementName=img图片, Path=Width, Mode=TwoWay}" />

        <!-- Slider bound to the image Height (label text: "Width:") -->
        <Label Name="lab宽度" Content="宽度:" Margin="3" />
        <Slider Name="sl宽度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="400" Value="{Binding ElementName=img图片, Path=Height, Mode=TwoWay}" />

        <!-- Slider bound to the image Opacity (label text: "Opacity:") -->
        <Label Name="lab透明度" Content="透明度:" Margin="3" />
        <Slider Name="sl透明度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="1" Value="{Binding ElementName=img图片, Path=Opacity, Mode=TwoWay}" />

        <!-- Stretch mode selector (label text: "Stretch mode:") -->
        <Label Name="lab拉伸方式" Content="拉伸方式:" Margin="3" />
        <ComboBox Name="txt拉伸方式" Margin="3" MinWidth="85">
            <ComboBoxItem Content="Fill" />
            <ComboBoxItem Content="None" IsSelected="True" />
            <ComboBoxItem Content="Uniform" />
            <ComboBoxItem Content="UniformToFill" />
        </ComboBox>
    </WrapPanel>

    <local:PanAndZoomViewer Grid.Row="3" Grid.Column="0" Height="300" Margin="3">
        <Image Name="img图片" Stretch="{Binding ElementName=txt拉伸方式, Path=Text, Mode=TwoWay}" />
    </local:PanAndZoomViewer>
2.4 The Tesseract command line cannot OCR a network image directly, so the image is downloaded first:
Image download:

    // Requires: using System.IO; using System.Net;
    // and a reference to System.Configuration for ConfigurationManager.
    private void fnStartDownload(string v_strImgPath, string v_strOutputDir, out string v_strTmpPath)
    {
        // Everything after the last '/' in the URL is the file name.
        int n = v_strImgPath.LastIndexOf('/');
        string fileName = v_strImgPath.Substring(n + 1, v_strImgPath.Length - n - 1);
        // Output base for Tesseract: output directory plus the file name without its extension.
        this.__OutputFileName = v_strOutputDir + "\\" + fileName.Substring(0, fileName.LastIndexOf("."));

        // Create the temporary download directory (app setting "tmpPath") if it does not exist yet.
        string Dir = System.Configuration.ConfigurationManager.AppSettings["tmpPath"];
        if (!Directory.Exists(Dir))
        {
            Directory.CreateDirectory(Dir);
        }

        v_strTmpPath = Dir + "\\" + fileName;

        // Download the image to the temporary path.
        using (WebClient client = new WebClient())
        {
            client.DownloadFile(v_strImgPath, v_strTmpPath);
        }

        // Alternative: manual stream-based download, left commented out.
        //Stream str = client.OpenRead(v_strImgPath);
        //StreamReader reader = new StreamReader(str);
        //byte[] mbyte = new byte[Int32.Parse(System.Configuration.ConfigurationManager.AppSettings["MaxDownloadImgLength"])];
        //int allmybyte = (int)mbyte.Length;
        //int startmbyte = 0;
        //while (allmybyte > 0)
        //{
        //    int m = str.Read(mbyte, startmbyte, allmybyte);
        //    if (m == 0)
        //    {
        //        break;
        //    }

        //    startmbyte += m;
        //    allmybyte -= m;
        //}

        //FileStream fstr = new FileStream(v_strTmpPath, FileMode.Create, FileAccess.Write);
        //fstr.Write(mbyte, 0, startmbyte);
        //str.Close();
        //fstr.Close();
    }
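The download step only applies to network images, so the program has to tell the two kinds of input apart and pick a temporary file name. Here is a minimal sketch of one way to do that with System.Uri rather than manual string splitting; `ImageSource`, `IsRemote`, and `TempPathFor` are hypothetical names, and treating only http and https as remote is my assumption.

```csharp
using System;
using System.IO;

public static class ImageSource
{
    // True when the path is an absolute http or https URL.
    public static bool IsRemote(string imagePath)
    {
        Uri uri;
        return Uri.TryCreate(imagePath, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Temporary file path for a downloaded image. Using AbsolutePath
    // drops any query string, so "scan.png?x=1" still yields "scan.png".
    public static string TempPathFor(string imageUrl, string tempDir)
    {
        string fileName = Path.GetFileName(new Uri(imageUrl).AbsolutePath);
        return Path.Combine(tempDir, fileName);
    }
}
```

A caller could download the image only when `IsRemote` returns true, and otherwise pass the local path to Tesseract unchanged.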
2.5 Use Process to invoke the Tesseract command line:
Invoking the Tesseract command line:

    // Requires: using System.Diagnostics;
    private void fnOCR(string v_strTesseractPath, string v_strSourceImgPath, string v_strOutputPath, string v_strLangPath)
    {
        using (Process process = new System.Diagnostics.Process())
        {
            process.StartInfo.FileName = v_strTesseractPath;
            // Quote the paths so that directories containing spaces are passed as single arguments.
            process.StartInfo.Arguments = "\"" + v_strSourceImgPath + "\" \"" + v_strOutputPath + "\" -l " + v_strLangPath;
            process.StartInfo.UseShellExecute = false;
            process.StartInfo.CreateNoWindow = true;
            process.StartInfo.RedirectStandardOutput = true;
            process.Start();
            // Drain the redirected output before waiting; a full, unread pipe can block the child process.
            process.StandardOutput.ReadToEnd();
            process.WaitForExit();
        }
    }
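Once the process has exited, the recognized text still has to be loaded for display. The sketch below shows one possible way; `OcrResult.Read` is a hypothetical helper, and it assumes that Tesseract wrote its result as UTF-8 text to the output base plus ".txt".

```csharp
using System;
using System.IO;
using System.Text;

public static class OcrResult
{
    // Reads "<outputBase>.txt" and returns the recognized text.
    // Assumption: the file is UTF-8 encoded.
    public static string Read(string outputBase)
    {
        string path = outputBase + ".txt";
        if (!File.Exists(path))
        {
            // No output file usually means the Tesseract run failed.
            throw new FileNotFoundException("OCR output file not found.", path);
        }
        return File.ReadAllText(path, Encoding.UTF8).Trim();
    }
}
```

Throwing when the file is missing surfaces a failed run (a wrong language name, for instance) rather than showing an empty result.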
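2.6 Testing a local image:
2.7 Testing a network image:
Summary:
In this post we briefly covered how to use Tesseract. For an open-source, free OCR engine, Chinese support is rare and welcome. Its recognition quality is not ideal, but it is good enough for small and medium-sized projects with modest requirements. There is a list of free OCR tools here for anyone interested. Next time I will test the OCR feature in OneNote 2010 and show how to call its API from a project.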
Original URL: http://www.cnblogs.com/shiddong/p/4079023.html