- Convert PDF to plain text in .NET WinForms and ASP.NET web projects.
- Extract text from one page or several pages of a PDF document.
- Convert a whole PDF document to a txt file.
- Extracted text keeps the original PDF file layout.
- Support text extraction for Western languages such as English, German and Spanish.
- No need for Adobe Reader or any other toolkits.
- Compatible with .NET Framework 2.0 through 4.5.
- Support 32-bit and 64-bit operating systems.
How to Extract Text from PDF in .NET
Compared with other popular image and document file formats, PDF is free to use and offers better security. However, when readers want to reuse some content from a PDF document, getting it out can be difficult, because copying and pasting PDF content does not preserve the original layout. This online tutorial presents a .NET solution. The .NET APIs and programming examples are described in the following parts.
pqScan .NET PDF to Text Extractor DLL extracts text from PDF files and converts PDF to txt files. It is a standalone library component developed on .NET Framework and can be used in any type of .NET application. Both Visual C# and Visual Basic .NET are supported.
API for How to Load PDF
.NET developers can load a PDF document from a local PDF file or from a PDF stream by using the following APIs. Both file streams and memory streams are supported.
public bool LoadPDF(string fileName);
public bool LoadPDF(Stream stream);
API for Total Page Count
The total page count lets .NET developers choose which pages to extract instead of extracting text from the whole PDF document.
public int PageCount;
API for How to Get Output Text
.NET programmers get the extracted text as a String object. Extraction is available for a single chosen page and for the whole document.
public String ToText(int pageIndex);
public String ToText();
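The load, page count and text APIs above could be combined roughly as follows. This is a minimal sketch rather than the library's own sample. The interface IPdfTextExtractor and the helper PdfTextHelper are hypothetical names introduced here because this page does not show the pqScan class name or namespace, so the real extractor object would need a thin adapter. The sketch also assumes that LoadPDF returns false on failure and that pageIndex starts at 0, neither of which is stated above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical interface mirroring the members listed on this page.
// It is not part of the pqScan library.
public interface IPdfTextExtractor
{
    bool LoadPDF(string fileName);
    bool LoadPDF(Stream stream);
    int PageCount { get; }
    string ToText(int pageIndex);
    string ToText();
}

public static class PdfTextHelper
{
    // Returns the text of every page in order, or null when loading fails.
    // Assumption: pageIndex is zero-based.
    public static List<string> ExtractPages(IPdfTextExtractor extractor, string pdfPath)
    {
        // Assumption: a false return value means the document could not be loaded.
        if (!extractor.LoadPDF(pdfPath))
        {
            return null;
        }

        List<string> pages = new List<string>();
        for (int i = 0; i < extractor.PageCount; i++)
        {
            pages.Add(extractor.ToText(i));
        }
        return pages;
    }
}
```

Going page by page makes it possible to skip pages or process them separately. When only the full text is needed, a single call to ToText() without an index is enough.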
API for How to Get Output Text File
pqScan .NET PDF to Text Extractor SDK also supplies an API for converting a PDF document to a txt file.
public void ToTextFile(string fileName);
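As an illustration of the stream overload together with the text file API, the sketch below loads a PDF held in memory and writes its text to disk. IPdfFileConverter and PdfToTxt are hypothetical names used only for this example, since the actual pqScan class is not named on this page. Treating a false result from LoadPDF as a failed load is also an assumption.

```csharp
using System;
using System.IO;

// Hypothetical interface covering only the two members used below.
// It is not part of the pqScan library.
public interface IPdfFileConverter
{
    bool LoadPDF(Stream stream);
    void ToTextFile(string fileName);
}

public static class PdfToTxt
{
    // Loads a PDF from a byte array through a MemoryStream and writes the
    // extracted text to txtPath. Returns false when the load call reports failure.
    public static bool Convert(IPdfFileConverter converter, byte[] pdfBytes, string txtPath)
    {
        using (MemoryStream stream = new MemoryStream(pdfBytes))
        {
            // Assumption: a false return value means the document could not be loaded.
            if (!converter.LoadPDF(stream))
            {
                return false;
            }

            // Called while the stream is still open, in case the library reads it lazily.
            converter.ToTextFile(txtPath);
            return true;
        }
    }
}
```

A FileStream could be passed the same way, because both overloads accept the Stream base type.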
Free .NET programming examples for extracting text from PDF files are provided online.