Commit b684a75

Merge pull request #295 from 3ace/us-1183-ocr-code-examples
[US-1183] OCR service code examples
2 parents 47ee72d + 65a16d4 commit b684a75

8 files changed

+770
-0
lines changed

README.md

Lines changed: 13 additions & 0 deletions
@@ -7,6 +7,19 @@ a pull request.
 While the majority of examples are fully in pure Go, there are a few examples that demonstrate additional
 functionality that requires CGO and external dependencies. Those examples are identified by the filename suffix "_cgo.go".

+## Disclaimer
+
+**IMPORTANT:** The code examples provided in this repository are for educational and demonstration purposes only. They are provided "as is" without warranty of any kind, either express or implied. These examples are intended to help developers understand how to use the UniPDF library and may not be suitable for production environments without additional security hardening, error handling, and testing.
+
+UniDoc (the maintainers of this repository) shall not be held responsible for any risks, damages, or issues arising from the use of these code examples in production or any other environment. Users are solely responsible for reviewing, testing, and adapting the code to meet their specific requirements and security standards before deploying to production systems.
+
+It is strongly recommended that you:
+- Conduct thorough security reviews and testing
+- Implement proper input validation and sanitization
+- Add appropriate error handling and logging
+- Follow security best practices for your specific use case
+- Consult with security professionals when handling sensitive data
+
 ## License codes
 UniPDF requires license codes to operate. There are two options:
 - Metered License API keys: Free ones can be obtained at https://cloud.unidoc.io

go.mod

Lines changed: 2 additions & 0 deletions
@@ -5,10 +5,12 @@ go 1.23.0
 require (
 	cloud.google.com/go/kms v1.18.5
 	github.com/ThalesIgnite/crypto11 v1.2.5
+	github.com/anthonynsimon/bild v0.14.0
 	github.com/aws/aws-sdk-go v1.55.6
 	github.com/bmatcuk/doublestar v1.3.4
 	github.com/boombuler/barcode v1.0.2
 	github.com/gabriel-vasile/mimetype v1.4.8
+	github.com/stefanhengl/gohocr v0.0.0-20171024154250-dde96807b100
 	github.com/trimmer-io/go-xmp v1.0.0
 	github.com/unidoc/globalsign-dss v0.0.0-20220330092912-b69d85b63736
 	github.com/unidoc/pkcs7 v0.3.0

go.sum

Lines changed: 4 additions & 0 deletions
@@ -24,6 +24,8 @@ github.com/adrg/sysfont v0.1.2/go.mod h1:6d3l7/BSjX9VaeXWJt9fcrftFaD/t7l11xgSywC
 github.com/adrg/xdg v0.3.0/go.mod h1:7I2hH/IT30IsupOpKZ5ue7/qNi3CoKzD6tL3HwpaRMQ=
 github.com/adrg/xdg v0.5.3 h1:xRnxJXne7+oWDatRhR1JLnvuccuIeCoBu2rtuLqQB78=
 github.com/adrg/xdg v0.5.3/go.mod h1:nlTsY+NNiCBGCK2tpm09vRqfVzrc2fLmXGpBLF0zlTQ=
+github.com/anthonynsimon/bild v0.14.0 h1:IFRkmKdNdqmexXHfEU7rPlAmdUZ8BDZEGtGHDnGWync=
+github.com/anthonynsimon/bild v0.14.0/go.mod h1:hcvEAyBjTW69qkKJTfpcDQ83sSZHxwOunsseDfeQhUs=
 github.com/aws/aws-sdk-go v1.55.6 h1:cSg4pvZ3m8dgYcgqB97MrcdjUmZ1BeMYKUxMMB89IPk=
 github.com/aws/aws-sdk-go v1.55.6/go.mod h1:eRwEWoyTWFMVYVQzKMNHWP5/RV4xIUGMQfXQHfHkpNU=
 github.com/bmatcuk/doublestar v1.3.4 h1:gPypJ5xD31uhX6Tf54sDPUOBXTqKH4c9aPY66CyQrS0=
@@ -109,6 +111,8 @@ github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZN
 github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4/go.mod h1:xMI15A0UPsDsEKsMN9yxemIoYk6Tm2C1GtYGdfGttqA=
 github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
 github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
+github.com/stefanhengl/gohocr v0.0.0-20171024154250-dde96807b100 h1:hF/ZvwhZFjvAXLTKinLJZwFf7ajPZp+LUyDc+qtoVzM=
+github.com/stefanhengl/gohocr v0.0.0-20171024154250-dde96807b100/go.mod h1:cPiGn9y/mCPkH6dScOMVru1KnTdtzh/2DvvJrFDS7Sc=
 github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
 github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
 github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=

ocr/README.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
# PDF OCR Examples

UniPDF supports integration with HTTP-based OCR (Optical Character Recognition) services to extract text from images and scanned PDF documents. These examples demonstrate how to configure and use OCR services to process images and reconstruct searchable PDFs from scanned documents.

The OCR functionality works by sending images to a configured HTTP endpoint that performs text recognition and returns the results in various formats including plain text and HOCR (HTML-based OCR format).

## Examples

- [hocr_sample.go](hocr_sample.go) illustrates how to process HOCR formatted OCR output, parsing word-level information including bounding boxes and confidence scores.
- [ocr_batch.go](ocr_batch.go) shows how to perform batch OCR processing on multiple images concurrently, with error handling and summary reporting.
- [ocr_sample.go](ocr_sample.go) demonstrates basic OCR usage by sending a single image to an HTTP OCR service and extracting the text content.
- [reconstruct_pdf_from_hocr.go](reconstruct_pdf_from_hocr.go) demonstrates a complete workflow to extract images from a PDF, perform OCR with HOCR output, parse the structured results, and reconstruct a searchable PDF with properly positioned text.

## Requirements

These examples require an HTTP OCR service running on `http://localhost:8080/file`. The examples are created using [unidoc/ocrserver](https://github.com/unidoc/ocrserver) as the OCR service. However, UniPDF's OCR API is designed to be flexible and should support other OCR services that accept image uploads via multipart form data and return text or HOCR formatted results.
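
The upload contract described under Requirements (an image sent as multipart form data, text or HOCR returned) can be sketched with the Go standard library alone. This is a minimal sketch under stated assumptions, not UniPDF's implementation: `buildUploadRequest`, `standInOCR`, and `upload` are hypothetical names, the handler is a stand-in that only echoes the upload size and the requested `format`, and the `{"result": ...}` response shape is assumed from what hocr_sample.go reads. The request is handed directly to the handler through `httptest.NewRecorder`, so nothing needs to be listening on a port.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
	"strings"
)

// buildUploadRequest builds a multipart/form-data POST carrying one file part
// plus optional plain form fields. All parameter names are illustrative.
func buildUploadRequest(url, fileField, filename string, file io.Reader, fields map[string]string) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	part, err := w.CreateFormFile(fileField, filename)
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(part, file); err != nil {
		return nil, err
	}
	for k, v := range fields {
		if err := w.WriteField(k, v); err != nil {
			return nil, err
		}
	}
	// Close writes the trailing boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", w.FormDataContentType())
	req.Header.Set("Accept", "application/json")
	return req, nil
}

// standInOCR is a hypothetical handler standing in for a real OCR service.
// It reports the size of the uploaded file and the requested format.
func standInOCR(rw http.ResponseWriter, r *http.Request) {
	f, _, err := r.FormFile("file")
	if err != nil {
		http.Error(rw, err.Error(), http.StatusBadRequest)
		return
	}
	defer f.Close()
	n, _ := io.Copy(io.Discard, f)
	fmt.Fprintf(rw, `{"result":"%d bytes, format=%s"}`, n, r.FormValue("format"))
}

// upload builds the request and passes it to the stand-in handler in-process.
func upload(fileContent, format string) (string, error) {
	req, err := buildUploadRequest("http://localhost:8080/file", "file", "image.jpg",
		strings.NewReader(fileContent), map[string]string{"format": format})
	if err != nil {
		return "", err
	}
	rec := httptest.NewRecorder()
	standInOCR(rec, req)
	if rec.Code != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d: %s", rec.Code, rec.Body.String())
	}
	return rec.Body.String(), nil
}

func main() {
	out, err := upload("fake image bytes", "hocr")
	if err != nil {
		fmt.Println("upload failed:", err)
		return
	}
	fmt.Println(out)
}
```

The file field name and the `format` form field mirror the `FileFieldName` and `FormFields` values used in the examples; a different OCR service may expect other names.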

ocr/hocr_sample.go

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
/**
 * This is a sample Go program that demonstrates how to use the UniPDF library
 * to perform OCR on an image using an HTTP OCR service that returns HOCR formatted
 * output. The program parses the HOCR response and extracts word-level information
 * including bounding boxes and confidence scores.
 *
 * This example uses https://github.com/unidoc/ocrserver as the OCR service.
 * However, UniPDF's OCR API is designed to support other OCR services that accept
 * image uploads via HTTP and return text or HOCR formatted results.
 *
 * Run as: go run hocr_sample.go input.jpg
 */
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"
	"strconv"

	"github.com/stefanhengl/gohocr"
	"github.com/unidoc/unipdf/v4/common/license"
	"github.com/unidoc/unipdf/v4/ocr"
)

func init() {
	// Make sure to load your metered License API key prior to using the library.
	// If you need a key, you can sign up and create a free one at https://cloud.unidoc.io
	err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
	if err != nil {
		panic(err)
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Printf("Usage: go run hocr_sample.go input.jpg\n")
		os.Exit(1)
	}

	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Printf("Error opening file: %v\n", err)
		os.Exit(1)
	}
	defer f.Close()

	// Configure OCR service options.
	opts := ocr.OCROptions{
		Url:           "http://localhost:8080/file",
		Method:        "POST",
		FileFieldName: "file",
		Headers: map[string]string{
			"Accept": "application/json",
		},
		FormFields: map[string]string{
			"format": "hocr",
		},
		TimeoutSeconds: 30,
	}

	// Create OCR client.
	client := ocr.NewHTTPOCRService(opts)

	result, err := client.ExtractText(context.Background(), f, "image.jpg")
	if err != nil {
		fmt.Printf("Error extracting text: %v\n", err)
		os.Exit(1)
	}

	// Parse JSON response to extract the "result" field.
	var jsonObj map[string]interface{}
	if err := json.Unmarshal(result, &jsonObj); err != nil {
		fmt.Printf("Error parsing JSON response: %v\n", err)
		os.Exit(1)
	}

	content, ok := jsonObj["result"].(string)
	if !ok {
		fmt.Printf("Error: result field is not a string\n")
		os.Exit(1)
	}
	fmt.Printf("Extracted text: %s\n", content)

	content, err = strconv.Unquote(content)
	if err != nil {
		fmt.Printf("Error unquoting content: %v\n", err)
		os.Exit(1)
	}

	contentBytes := []byte(content)

	data, err := gohocr.Parse(contentBytes)
	if err != nil {
		fmt.Printf("Error parsing HOCR data: %v\n", err)
		os.Exit(1)
	}

	for _, v := range data.Words {
		fmt.Printf("Word: %s, Title: %s\n", v.Content, v.Title)
	}
}
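
The loop above prints each word's raw title attribute. In HOCR output that attribute is conventionally a semicolon-separated property string such as `bbox 36 92 96 116; x_wconf 95`, which carries the bounding box and the word confidence. The following is a minimal sketch of splitting it into numbers, assuming that format; `WordInfo` and `parseTitle` are hypothetical names, not part of gohocr or UniPDF, and properties other than `bbox` and `x_wconf` are ignored.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// WordInfo holds the two properties this sketch reads from a title attribute.
type WordInfo struct {
	X0, Y0, X1, Y1 int     // bounding box corners, in image pixels
	Confidence     float64 // x_wconf value, conventionally 0 to 100
}

// parseTitle reads "bbox" and "x_wconf" from an HOCR title string such as
// "bbox 36 92 96 116; x_wconf 95". Unknown properties are skipped.
func parseTitle(title string) (WordInfo, error) {
	var info WordInfo
	for _, prop := range strings.Split(title, ";") {
		fields := strings.Fields(prop)
		if len(fields) == 0 {
			continue
		}
		switch fields[0] {
		case "bbox":
			if len(fields) != 5 {
				return info, fmt.Errorf("bbox: want 4 values, got %d", len(fields)-1)
			}
			var v [4]int
			for i := range v {
				n, err := strconv.Atoi(fields[i+1])
				if err != nil {
					return info, fmt.Errorf("bbox: %w", err)
				}
				v[i] = n
			}
			info.X0, info.Y0, info.X1, info.Y1 = v[0], v[1], v[2], v[3]
		case "x_wconf":
			if len(fields) != 2 {
				return info, fmt.Errorf("x_wconf: want 1 value, got %d", len(fields)-1)
			}
			c, err := strconv.ParseFloat(fields[1], 64)
			if err != nil {
				return info, fmt.Errorf("x_wconf: %w", err)
			}
			info.Confidence = c
		}
	}
	return info, nil
}

func main() {
	info, err := parseTitle("bbox 36 92 96 116; x_wconf 95")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("box=(%d,%d)-(%d,%d) confidence=%.0f\n",
		info.X0, info.Y0, info.X1, info.Y1, info.Confidence)
}
```

In the sample program, a helper like this could be called with `v.Title` inside the loop, provided the OCR service emits titles in this form.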

ocr/ocr_batch.go

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
/**
 * This is a sample Go program that demonstrates how to use the UniPDF library
 * to perform batch OCR processing on multiple images using an HTTP OCR service.
 * The program processes multiple image files concurrently and displays the extracted
 * text results along with a summary of successful and failed operations.
 *
 * This example uses https://github.com/unidoc/ocrserver as the OCR service.
 * However, UniPDF's OCR API is designed to support other OCR services that accept
 * image uploads via HTTP and return text or HOCR formatted results.
 *
 * Run as: go run ocr_batch.go image1.jpg image2.png [image3.jpg ...]
 */
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	"github.com/unidoc/unipdf/v4/common/license"
	"github.com/unidoc/unipdf/v4/ocr"
)

func init() {
	// Make sure to load your metered License API key prior to using the library.
	// If you need a key, you can sign up and create a free one at https://cloud.unidoc.io
	err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
	if err != nil {
		panic(err)
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Printf("Usage: go run ocr_batch.go image1.jpg image2.png [image3.jpg ...]\n")
		os.Exit(1)
	}

	// Get list of image files from command line arguments
	filePaths := os.Args[1:]

	// Validate that all files exist and are accessible
	for _, filePath := range filePaths {
		if _, err := os.Stat(filePath); err != nil {
			fmt.Printf("Error: cannot access file %s: %v\n", filePath, err)
			os.Exit(1)
		}
	}

	// Configure OCR service options.
	opts := ocr.OCROptions{
		Url:           "http://localhost:8080/file",
		Method:        "POST",
		FileFieldName: "file",
		Headers: map[string]string{
			"Accept": "application/json",
		},
		TimeoutSeconds: 30,
	}

	// Create OCR client.
	client := ocr.NewOCRHTTPClient(opts)

	fmt.Printf("Processing %d files...\n", len(filePaths))

	// Batch process files. Results and errors are indexed like filePaths.
	results, errs := client.BatchProcessFiles(context.Background(), filePaths)

	// Display results
	for i, filePath := range filePaths {
		filename := filepath.Base(filePath)
		fmt.Printf("\n--- Results for %s ---\n", filename)

		if errs[i] != nil {
			fmt.Printf("Error processing %s: %v\n", filename, errs[i])
			continue
		}

		fmt.Printf("Extracted text from %s:\n%s\n", filename, string(results[i]))
	}

	// Summary
	successCount := 0
	errorCount := 0
	for _, err := range errs {
		if err != nil {
			errorCount++
		} else {
			successCount++
		}
	}

	fmt.Printf("\n--- Summary ---\n")
	fmt.Printf("Successfully processed: %d files\n", successCount)
	fmt.Printf("Failed to process: %d files\n", errorCount)
	fmt.Printf("Total files: %d\n", len(filePaths))
}
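
The batch example relies on results and errors coming back as two slices indexed like the input paths. How the library produces them is not shown in this commit; the following is only a sketch of one common way to get that shape with goroutines, a `sync.WaitGroup`, and a semaphore channel that caps concurrency. `batchProcess`, `processFunc`, and `fakeOCR` are hypothetical names, and `fakeOCR` stands in for a real OCR call.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
)

// processFunc stands in for a single OCR call on one file (hypothetical).
type processFunc func(path string) ([]byte, error)

// batchProcess runs process on every path with at most `workers` calls in
// flight. results[i] and errs[i] always correspond to paths[i].
func batchProcess(paths []string, workers int, process processFunc) ([][]byte, []error) {
	results := make([][]byte, len(paths))
	errs := make([]error, len(paths))
	if workers < 1 {
		workers = 1
	}

	sem := make(chan struct{}, workers) // limits concurrent calls
	var wg sync.WaitGroup
	for i, p := range paths {
		wg.Add(1)
		sem <- struct{}{} // blocks while `workers` calls are already running
		go func(i int, p string) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only its own index, so no mutex is needed.
			results[i], errs[i] = process(p)
		}(i, p)
	}
	wg.Wait()
	return results, errs
}

// fakeOCR pretends to recognize text and rejects .txt inputs.
func fakeOCR(path string) ([]byte, error) {
	if strings.HasSuffix(path, ".txt") {
		return nil, errors.New("unsupported file type")
	}
	return []byte("text from " + path), nil
}

func main() {
	paths := []string{"a.jpg", "b.txt", "c.png"}
	results, errs := batchProcess(paths, 2, fakeOCR)
	for i, p := range paths {
		if errs[i] != nil {
			fmt.Printf("%s: error: %v\n", p, errs[i])
			continue
		}
		fmt.Printf("%s: %s\n", p, results[i])
	}
}
```

Keeping one slot per input means a failure on one file never shifts the position of another file's result, which is what lets the example report per-file errors and a summary from the same two slices.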

ocr/ocr_sample.go

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
/**
 * This is a sample Go program that demonstrates how to use the UniPDF library
 * to perform OCR on an image using an HTTP OCR service. The program sends an image
 * to the configured OCR endpoint and displays the extracted text.
 *
 * This example uses https://github.com/unidoc/ocrserver as the OCR service.
 * However, UniPDF's OCR API is designed to support other OCR services that accept
 * image uploads via HTTP and return text or HOCR formatted results.
 *
 * Run as: go run ocr_sample.go input.jpg
 */
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/unidoc/unipdf/v4/common/license"
	"github.com/unidoc/unipdf/v4/ocr"
)

func init() {
	// Make sure to load your metered License API key prior to using the library.
	// If you need a key, you can sign up and create a free one at https://cloud.unidoc.io
	err := license.SetMeteredKey(os.Getenv(`UNIDOC_LICENSE_API_KEY`))
	if err != nil {
		panic(err)
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Printf("Usage: go run ocr_sample.go input.jpg\n")
		os.Exit(1)
	}

	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Printf("Error opening file: %v\n", err)
		os.Exit(1)
	}
	defer f.Close()

	// Configure OCR service options.
	opts := ocr.OCROptions{
		Url:           "http://localhost:8080/file",
		Method:        "POST",
		FileFieldName: "file",
		Headers: map[string]string{
			"Accept": "application/json",
		},
		TimeoutSeconds: 30,
	}

	// Create OCR client.
	client := ocr.NewHTTPOCRService(opts)

	result, err := client.ExtractText(context.Background(), f, "image.jpg")
	if err != nil {
		fmt.Printf("Error extracting text: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("Extracted text: %s\n", string(result))
}
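
This sample prints the response body as received. Because it requests `application/json`, the body may be a JSON envelope rather than bare text; hocr_sample.go in this commit reads the text from a `result` field. The following is a small sketch of pulling that field out with a typed struct, assuming the same envelope applies here. `ocrResponse` and `extractResult` are hypothetical names, and the `version` field in the sample payload is made up for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ocrResponse models only the field this sketch needs. The pointer lets us
// tell a missing "result" apart from an empty one.
type ocrResponse struct {
	Result *string `json:"result"`
}

// extractResult decodes an assumed {"result": "..."} envelope and returns the text.
func extractResult(raw []byte) (string, error) {
	var resp ocrResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return "", fmt.Errorf("decoding OCR response: %w", err)
	}
	if resp.Result == nil {
		return "", errors.New("OCR response has no result field")
	}
	return *resp.Result, nil
}

func main() {
	// Hypothetical payload; a real service may include other fields.
	raw := []byte(`{"result":"Hello\nWorld","version":"0.2.0"}`)
	text, err := extractResult(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

In the sample above, `extractResult(result)` would take the place of `string(result)` if the service does respond with this envelope.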
