Good Deeds, Done Dirt Cheap

Author

Eric Brown-Munoz (with no collaborators)

Abstract

A system for transcribing scanned documents (such as deeds). It will handle requests from customers, store information about each deed and its transcription, and handle billing for the customer and payment for the typist.

What

This is a system to connect a customer, who needs to turn images into text quickly and easily, with a typist who will be paid fairly for the work.

Each project will start as a request (in an XML format, of course) that specifies the rate the customer is offering, instructions for the job, and a list of links to the images that need to be typed. The request may also contain fields (i.e. a form) that the typist needs to fill out.
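To make the request format more concrete, here is a minimal Java sketch of reading such a request. The element and attribute names (request, rate, instructions, image/href, field/name) and the JobRequest class are my own placeholders for illustration; the actual schema has not been designed yet.

```java
import java.io.StringReader;
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical in-memory form of a customer's request (names are assumptions).
class JobRequest {
    final BigDecimal rate;           // rate the customer is offering
    final String instructions;       // instructions on the job
    final List<String> imageLinks;   // links to the images to be typed
    final List<String> fieldNames;   // optional form fields for the typist

    JobRequest(BigDecimal rate, String instructions,
               List<String> imageLinks, List<String> fieldNames) {
        this.rate = rate;
        this.instructions = instructions;
        this.imageLinks = imageLinks;
        this.fieldNames = fieldNames;
    }

    // Parses the assumed request format with the standard JAXP DOM parser.
    static JobRequest parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            return new JobRequest(
                new BigDecimal(firstText(root, "rate")),
                firstText(root, "instructions"),
                attributeValues(root, "image", "href"),
                attributeValues(root, "field", "name"));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a valid job request", e);
        }
    }

    // Trimmed text of the first element with the given tag name.
    private static String firstText(Element root, String tag) {
        NodeList nodes = root.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("Missing <" + tag + "> element");
        }
        return nodes.item(0).getTextContent().trim();
    }

    // Value of one attribute on every element with the given tag name.
    private static List<String> attributeValues(Element root, String tag, String attr) {
        List<String> values = new ArrayList<>();
        NodeList nodes = root.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(((Element) nodes.item(i)).getAttribute(attr));
        }
        return values;
    }
}
```

A request with no field elements simply yields an empty fieldNames list, which matches the idea that the form is optional.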

A typist will select a job, agree on the rate, and then type the text from the images. The typed text will be stored in an XML database.

The customer will first receive a non-text form of the work (e.g. a non-copyable PDF file) for his or her approval. When the work is accepted, he or she will receive a download in XML form (or as text or HTML, as transformed by optional XSLT files).
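The optional XSLT step could look roughly like the Java sketch below, which uses the standard javax.xml.transform API to turn a stored transcription into plain text. The transcription/page element names, the DeliveryTransform class, and the stylesheet itself are assumptions made up for this example, not a settled format.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper that applies an XSLT file to a finished transcription.
class DeliveryTransform {

    // Example stylesheet: emits each <page> of a <transcription> as a line of text.
    static final String TO_TEXT_XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/transcription\">"
        + "<xsl:for-each select=\"page\">"
        + "<xsl:value-of select=\".\"/><xsl:text>&#10;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the given stylesheet over the given XML and returns the result as a string.
    static String apply(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }
}
```

An HTML download would use the same apply method with a different stylesheet, so adding an output format should not require new Java code.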

When the completed job is downloaded, the system will generate a bill for the customer and a pay stub for the typist (minus a small cut for a hungry developer/web site manager).
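The arithmetic behind the bill and the pay stub might be sketched in Java as follows. The per-image rate, the cut expressed as a fraction, the rounding to whole cents, and the Settlement class are all assumptions for illustration; the actual pricing rules are still open.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical result of settling one completed job.
class Settlement {
    final BigDecimal customerBill; // what the customer is billed
    final BigDecimal siteCut;      // the small cut kept by the site
    final BigDecimal typistPay;    // what appears on the typist's pay stub

    Settlement(BigDecimal customerBill, BigDecimal siteCut, BigDecimal typistPay) {
        this.customerBill = customerBill;
        this.siteCut = siteCut;
        this.typistPay = typistPay;
    }

    // Assumes the agreed rate is per image and the cut is a fraction such as 0.10.
    static Settlement settle(BigDecimal ratePerImage, int imageCount, BigDecimal cutFraction) {
        // Bill the customer for every image, rounded to whole cents.
        BigDecimal bill = ratePerImage
            .multiply(BigDecimal.valueOf(imageCount))
            .setScale(2, RoundingMode.HALF_UP);
        // The site's cut is taken out of the bill, also rounded to whole cents.
        BigDecimal cut = bill.multiply(cutFraction).setScale(2, RoundingMode.HALF_UP);
        // The typist receives whatever remains, so bill = cut + pay always holds.
        BigDecimal pay = bill.subtract(cut);
        return new Settlement(bill, cut, pay);
    }
}
```

For example, 12 images at a rate of 0.50 with a 10% cut would give a bill of 6.00, a cut of 0.60, and typist pay of 5.40. BigDecimal is used instead of double so that money amounts do not pick up binary rounding errors.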

Why

The idea for this project came from my brother, who is a real estate investor. He noted that deeds and other useful public data are being put on the internet in scanned form. This is mildly useful, but a searchable text form is far more useful.

It seems to me there is plenty of cheap labor to do this kind of thing, and a public website to farm out this work may be a good idea.

How

For the backend I would like to use the Apache Xindice database. I will use this for storing business data (e.g. invoices and payments to typists) as well as data on the images (e.g. the transcribed text). I am planning a middle layer in J2EE, with classes to access the database.

The presentation layer will be JSPs, which will allow both customers and typists to interact with the system from their browsers. Using XSL-FO to render an image, I will generate a "preview" that lets the customer see the finished work, in a form that is not usable as text, before paying.

I don't mind submitting this project on nice. (I do plan to develop on a personal machine.)

Questions

None at this time.