Converting Legislative Databases into More Easily Accessible Formats
The client is a legislative drafting office based in New South Wales, Australia. It drafts all bills required for introduction into parliament, as well as a wide range of subordinate legislation, including regulations, rules, proclamations, and orders. It also provides the federal government with an integrated range of services, such as publishing legislation and advising on various pieces of legislation from time to time.
The client worked with its preferred vendors on the conversion and XML coding of its WordPerfect 5.1 repository of legislation. The business requirement was to convert the WordPerfect documents into plain-text documents with XML coding applied, and to validate the resulting documents using Cascading Style Sheets (CSS).
Challenges
- Over 90 MB of legislative data had to be converted using various DTDs: ACT.DTD, SUBORDLEG.DTD, LEGHISTORY.DTD, and INDEX.DTD
- The input documents (in WordPerfect 5.1 format) had to be converted into SGML files in accordance with the XML-compliant DTD
- The data included tables of varying complexity, graphics, equations, and forms
- The database contained a number of acts and statutory instruments formatted in the client’s old typographic style
- It was essential that the conversion did not introduce accidental alterations to the text and that the output conformed to the new design
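The last challenge above, avoiding accidental alterations of the text, can be illustrated with a simple comparison of the source text against the converted markup with its tags removed. This is only a minimal sketch in Python, not the method used on the project: the function names and the sample tag names (`section`, `no`, `title`, `para`, `i`) are hypothetical, and whitespace is ignored entirely so that re-wrapped lines do not count as changes.

```python
import html
import re


def strip_whitespace(text: str) -> str:
    """Remove all whitespace so line wrapping and indentation are ignored."""
    return re.sub(r"\s+", "", text)


def visible_text(markup: str) -> str:
    """Drop tags and decode character entities, leaving only the wording."""
    return html.unescape(re.sub(r"<[^>]+>", "", markup))


def text_preserved(source_text: str, converted_markup: str) -> bool:
    """Return True if the converted markup carries exactly the source wording."""
    return strip_whitespace(source_text) == strip_whitespace(
        visible_text(converted_markup)
    )


# Hypothetical sample data, invented for illustration only
source = "1 Name of Act\nThis Act is the Example Act 1999."
converted = (
    "<section><no>1</no> <title>Name of Act</title>\n"
    "<para>This Act is the <i>Example Act 1999</i>.</para></section>"
)
unchanged = text_preserved(source, converted)
altered = text_preserved(source, converted.replace("1999", "1989"))
```

A check of this kind flags changed wording but not changed spacing, so it would complement rather than replace a visual review.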
Solution and Approach
Lumina Datamatics proposed a proof of concept (POC) for this project. The solution comprised:
- In-house development of a Perl script to convert RTF codes into SGML tags
- A dedicated team well versed in the DTDs and the client's requirements
- Successful delivery of the XML conversion at a rate of 10 MB per month
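The idea of converting RTF codes into SGML tags can be sketched as a small mapping from RTF control words to tag names. The project used a Perl script whose details are not given here, so the Python example below is an illustration under stated assumptions: the tag names in `TAG_FOR_CONTROL` and the function name are hypothetical, unknown control words are dropped, and escaped characters and markup-significant characters in the text are not handled.

```python
import re

# Hypothetical mapping from RTF control words to SGML tag names
TAG_FOR_CONTROL = {"b": "bold", "i": "italic", "ul": "underline"}

# A control word (with optional numeric parameter and one trailing space),
# a group brace, or a run of plain text
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|([{}])|([^\\{}]+)")


def rtf_fragment_to_sgml(fragment: str) -> str:
    """Convert a simple RTF fragment, closing tags when their group ends."""
    output = []
    open_tags = []  # one list of opened tags per RTF group
    for match in TOKEN.finditer(fragment):
        word, _param, brace, text = match.groups()
        if brace == "{":
            open_tags.append([])
        elif brace == "}":
            if open_tags:
                # Close tags in reverse order so the nesting stays valid
                for tag in reversed(open_tags.pop()):
                    output.append(f"</{tag}>")
        elif word is not None:
            tag = TAG_FOR_CONTROL.get(word)
            if tag and open_tags:
                open_tags[-1].append(tag)
                output.append(f"<{tag}>")
        elif text is not None:
            output.append(text)
    return "".join(output)


sample = rtf_fragment_to_sgml(r"This is {\b bold} and {\i\ul both}.")
```

Tracking open tags per group mirrors how RTF scopes formatting to braces, which keeps the generated tags properly nested.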
The following approach was set down:
- The quality checklist was updated per the latest specifications and feedback from the client on previous batches
- The files were opened in XMetaL to check the rendering of tables and forms
- All files in every batch were indented using a program written specifically for this project
- The final files were zipped per the required directory structure and uploaded to the client’s FTP server
- Initially, 10 sample files of legislation were processed; subsequently, batches of approximately 200 files were worked on
- Data in WordPerfect 5.1 format was converted into Rich Text Format (RTF) using MS Word
- The workflow for the XML conversion project was designed to accommodate recurrent modifications, taking into account the complexity of the process
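Two of the steps above, indenting the files and zipping them in a set directory structure, can be sketched in Python as follows. This is not the project's tailor-made program: the function names, the sample element names, and the assumption that each batch is a folder of `.xml` files are hypothetical. It relies on `xml.etree.ElementTree.indent`, which requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
import zipfile
from pathlib import Path


def indent_xml(xml_text: str) -> str:
    """Return the document re-serialised with two-space indentation."""
    root = ET.fromstring(xml_text)
    # Only whitespace-only gaps between elements are changed;
    # existing text content is left as it is
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


def package_batch(batch_dir: Path, archive_path: Path) -> list[str]:
    """Zip every .xml file under batch_dir, keeping relative paths."""
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(batch_dir.rglob("*.xml")):
            # Store paths relative to the batch folder so the archive
            # reproduces the folder layout when unpacked
            arcname = path.relative_to(batch_dir).as_posix()
            archive.write(path, arcname)
            names.append(arcname)
    return names


# Hypothetical element names, used for illustration only
indented = indent_xml(
    "<act><title>Example</title><part><section>Text</section></part></act>"
)
```

Because legislative text often mixes inline tags with running text, any indentation step would need checking against the text-preservation requirement before use.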