Extract Chinese text from PDF (C#)

wang
Dear all,

I need to write a program to extract Chinese text from PDF files. I use the PdfTextExtractor.GetTextFromPage() function to extract text, and it works perfectly for English text. But when I use the same code to extract Chinese content, it returns garbled text. I think the problem is related to extracting text from CID or Unicode fonts. Thank you.