IRS Prototype Audit Support Tool


William B. Trautman


The purpose of the project is to develop a prototype audit support tool for the international division of the IRS.


The tool would retrieve a tax return, stored as an XML document, from a Microsoft SQL Server database, parse the relevant parts of the document into a Java taxpayer object, render the object as a tree that lets the user navigate the hierarchical data, and perform basic risk computations on the data. The tool may also demonstrate that related taxpaying entities can be integrated into a broader Java enterprise object.
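
As a rough illustration of the kind of object I have in mind, the sketch below models a taxpayer with named line items and related entities, and computes one simple indicator across the whole enterprise. The class name `Entity`, the methods `totalOf` and `ratio`, and the idea of a ratio of two line items as a "risk" measure are all placeholders assumed for illustration; they are not taken from any IRS schema or methodology.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a taxpayer (or related party) with named
// line-item amounts and a list of related entities beneath it.
class Entity {
    final String name;
    final Map<String, Double> items = new LinkedHashMap<>();
    final List<Entity> related = new ArrayList<>();

    Entity(String name) {
        this.name = name;
    }

    // Record one line item; returns this so calls can be chained.
    Entity item(String label, double amount) {
        items.put(label, amount);
        return this;
    }

    // Attach a related entity, preserving the hierarchy.
    Entity add(Entity child) {
        related.add(child);
        return this;
    }

    // Sum one line item over this entity and everything below it.
    double totalOf(String label) {
        double sum = items.getOrDefault(label, 0.0);
        for (Entity e : related) {
            sum += e.totalOf(label);
        }
        return sum;
    }

    // Illustrative indicator only: enterprise-wide ratio of two line items.
    // Returns NaN when the denominator total is zero.
    double ratio(String numerator, String denominator) {
        double d = totalOf(denominator);
        return d == 0.0 ? Double.NaN : totalOf(numerator) / d;
    }
}
```

For example, `new Entity("Parent").add(new Entity("Subsidiary"))` would give a two-level enterprise, and `ratio(...)` could then be evaluated at either level. The real risk measures would depend on what the return data actually contain.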


Large business taxpayers have recently begun to file their tax returns electronically in XML format, and the IRS is struggling to capture and use the data. It currently spends significant resources extracting the XML data into hundreds of flat-file tables in a Microsoft SQL Server database, which strips the data of its embedded hierarchy. This project aims to show that it is possible to parse the data into a Java object, transparently incorporate related enterprises into an overall enterprise data structure, preserve the hierarchical organizational structure, and perform basic risk analyses within that framework.


I plan to establish a JDBC connection to a local Microsoft SQL Server database and retrieve fake taxpayer data (created for the purposes of this project) that reside there in XML format. I will then use a SAX parser to parse the data into a Java object designed to represent the taxpayer. I will use the same approach to parse the data for a related entity and combine information from the two data sources into a larger "economic enterprise" Java object. I will then render the object using a JTree, which will allow the user to navigate the data. Depending on the data and stylesheets that are available, I may also render tax return page images. Finally, I will add a menu-driven facility for performing simple risk analyses on the object.
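
The SAX-to-tree step might look roughly like the sketch below. It is a simplification: instead of a purpose-built taxpayer class it builds `DefaultMutableTreeNode` objects directly, one per XML element, and the class name `ReturnTreeHandler` and the "Name: value" labelling of leaf elements are my own assumptions rather than a settled design.

```java
import java.io.StringReader;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: mirrors the XML element hierarchy as tree nodes.
class ReturnTreeHandler extends DefaultHandler {
    private DefaultMutableTreeNode root;
    private DefaultMutableTreeNode current;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        DefaultMutableTreeNode node = new DefaultMutableTreeNode(qName);
        if (current == null) {
            root = node;            // first element becomes the tree root
        } else {
            current.add(node);      // otherwise nest under the open element
        }
        current = node;
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so accumulate it.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        // Leaf elements that carry text are labelled "Name: value".
        if (current.isLeaf() && !value.isEmpty()) {
            current.setUserObject(qName + ": " + value);
        }
        text.setLength(0);
        current = (DefaultMutableTreeNode) current.getParent();
    }

    // Parse an XML string and return the root node of the resulting tree.
    static DefaultMutableTreeNode parse(String xml) {
        try {
            ReturnTreeHandler handler = new ReturnTreeHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse return XML", e);
        }
    }
}
```

The returned root can be passed to `new JTree(root)` for display. Combining two entities into one "economic enterprise" could then be as simple as adding both roots as children of a new parent node, though the final design may well differ.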


I assume that no Microsoft SQL Server database is available to the CSCI E-259 staff. I could read the XML data from a file instead. I would prefer, however, to set up the connection to the database, since all of our XML data are housed in such a database. Would it be possible to provide a live demonstration of the tool at the completion of the project, or should I just read the XML data in from a file?
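
Either way, the two options could sit behind one interface so that the rest of the tool does not depend on where the XML comes from. The sketch below assumes a hypothetical table `TaxReturn` with columns `ReturnId` and `ReturnXml`, a caller-supplied JDBC URL, and a SQL Server JDBC driver on the classpath; the interface and class names are placeholders as well.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical abstraction over "where the return XML lives".
interface ReturnSource {
    String loadXml(String returnId) throws IOException;
}

// Reads <directory>/<returnId>.xml from disk.
class FileReturnSource implements ReturnSource {
    private final Path directory;

    FileReturnSource(Path directory) {
        this.directory = directory;
    }

    @Override
    public String loadXml(String returnId) throws IOException {
        byte[] bytes = Files.readAllBytes(directory.resolve(returnId + ".xml"));
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

// Reads the XML from a database column over JDBC.
// Table and column names here are assumptions, not the real schema.
class JdbcReturnSource implements ReturnSource {
    private final String jdbcUrl;

    JdbcReturnSource(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    @Override
    public String loadXml(String returnId) throws IOException {
        String sql = "SELECT ReturnXml FROM TaxReturn WHERE ReturnId = ?";
        try (Connection c = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setString(1, returnId);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    throw new IOException("No return with id " + returnId);
                }
                return rs.getString(1);
            }
        } catch (SQLException e) {
            throw new IOException("Database read failed", e);
        }
    }
}
```

With this arrangement, a submitted version could use `FileReturnSource` while a live demonstration uses `JdbcReturnSource`, with no other changes to the tool.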