pqScan .NET PDF to Text Extractor works much like OCR technology and can easily be used to recognize text in PDF files from C#. The example below shows how to use the C# class to get text from PDF pages in a Visual Studio .NET program. C# developers can quickly extract text from a single page, a few pages, or all pages of a PDF document.
    using System;
    using System.Text;
    using PQScan.PDFToText;

    namespace PDF2Text
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create an instance of PQScan.PDFToText.PDFExtractor object.
                PDFExtractor extractor = new PDFExtractor();

                // Load a PDF document.
                extractor.LoadPDF("sample.pdf");

                // Get total page count.
                int count = extractor.PageCount;

                for (int i = 0; i < count; i++)
                {
                    // Extract text from each PDF file page.
                    string pageText = extractor.ToText(i);
                    Console.WriteLine(pageText);
                }

                // Extract text from whole PDF document.
                string totalText = extractor.ToText();
                Console.WriteLine(totalText);
            }
        }
    }
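The example above handles one page at a time and the whole document. For the "a few pages" case, a minimal sketch might look like the following. It assumes page indexes are zero-based, as the loop above suggests. The `PageRange` class and `ExtractPages` helper are hypothetical names used for illustration and are not part of the pqScan API. A stand-in delegate replaces the extractor so the sketch runs without the library.

```csharp
using System;
using System.Collections.Generic;

namespace PDF2TextRange
{
    public static class PageRange
    {
        // Hypothetical helper: collects the text of pages first..last
        // (zero-based, inclusive). pageToText stands in for a per-page
        // call such as extractor.ToText(i).
        public static List<string> ExtractPages(
            Func<int, string> pageToText, int pageCount, int first, int last)
        {
            if (pageToText == null) throw new ArgumentNullException(nameof(pageToText));

            // Clamp the requested range to the pages that actually exist.
            int start = Math.Max(first, 0);
            int end = Math.Min(last, pageCount - 1);

            var texts = new List<string>();
            for (int i = start; i <= end; i++)
            {
                texts.Add(pageToText(i));
            }
            return texts;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Stand-in data so the sketch runs without the library.
            // With pqScan, the call would presumably look like this
            // (assuming zero-based page indexes):
            //   PDFExtractor extractor = new PDFExtractor();
            //   extractor.LoadPDF("sample.pdf");
            //   var texts = PageRange.ExtractPages(
            //       i => extractor.ToText(i), extractor.PageCount, 1, 2);
            string[] fakePages = { "page one", "page two", "page three", "page four" };

            // Extract the second and third pages (indexes 1 and 2).
            List<string> texts = PageRange.ExtractPages(
                i => fakePages[i], fakePages.Length, 1, 2);

            foreach (string text in texts)
            {
                Console.WriteLine(text);
            }
        }
    }
}
```

Clamping the range keeps the helper from requesting a page index outside `0..pageCount - 1`, so a range that overshoots the document simply returns the pages that exist.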
PDF to Text File Conversion - C# Example
Our .NET PDF to Text Converter software also lets users convert a PDF to a text file without losing formatting, using C# code. Copy the free example below to extract text from the whole PDF and save it to a text file.
    using System;
    using System.Text;
    using PQScan.PDFToText;

    namespace PDF2TextFile
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create an instance of PQScan.PDFToText.PDFExtractor object.
                PDFExtractor extractor = new PDFExtractor();

                // Load a PDF file.
                extractor.LoadPDF("sample.pdf");

                // Convert whole PDF text to txt file.
                extractor.ToTextFile("output-text.txt");
            }
        }
    }
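If one text file per page is preferred over a single file, the per-page text can be written out with the standard `System.IO` classes. The following is only a sketch under assumptions: the `PageFiles` class, the `SavePages` helper, the `output-pages` directory, and the `page-001.txt` naming scheme are all hypothetical, and page indexes are assumed to be zero-based. A stand-in delegate replaces the extractor so the sketch runs without the library.

```csharp
using System;
using System.IO;

namespace PDF2TextFiles
{
    public static class PageFiles
    {
        // Hypothetical helper: writes one text file per page
        // (page-001.txt, page-002.txt, ...) and returns the number written.
        // pageToText stands in for a per-page call such as extractor.ToText(i).
        public static int SavePages(
            Func<int, string> pageToText, int pageCount, string directory)
        {
            if (pageToText == null) throw new ArgumentNullException(nameof(pageToText));

            // Create the target directory if it does not exist yet.
            Directory.CreateDirectory(directory);

            int written = 0;
            for (int i = 0; i < pageCount; i++)
            {
                // File names are one-based and zero-padded so they sort correctly.
                string fileName = string.Format("page-{0:D3}.txt", i + 1);
                string path = Path.Combine(directory, fileName);
                File.WriteAllText(path, pageToText(i));
                written++;
            }
            return written;
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Stand-in data so the sketch runs without the library.
            // With pqScan, the call would presumably look like this
            // (assuming zero-based page indexes):
            //   PDFExtractor extractor = new PDFExtractor();
            //   extractor.LoadPDF("sample.pdf");
            //   PageFiles.SavePages(
            //       i => extractor.ToText(i), extractor.PageCount, "output-pages");
            string[] fakePages = { "first page text", "second page text" };

            int written = PageFiles.SavePages(
                i => fakePages[i], fakePages.Length, "output-pages");

            Console.WriteLine("Wrote " + written + " file(s).");
        }
    }
}
```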