The Karnataka Public Service Commission (KPSC) is the state-level body that conducts examinations for government services in the various departments of the State. At every level of examination, from clerical and below to managerial, the applications received far outnumber the actual vacancies. The number of applications for each examination is too large for a manual system to handle efficiently, whether in processing the applications, conducting the examination, or publishing the final results. Moreover, a manual system leaves more room for corruption through data manipulation, which reduces the credibility of the commission. KPSC therefore used Intelligent Character Recognition (ICR) technology for processing the application forms for its examinations, and Optical Mark Recognition (OMR) technology along with barcodes for designing the answer scripts. Comat Technologies provided the technology. The Government of Karnataka has also used the same ICR technology for designing one of its survey forms. Under the Sarva Shiksha Abhiyan program of the Government of India, each State is required to collect detailed data on school children at the primary education level, along with background information, in order to provide relevant universal primary education to the children. This data has been collected by the respective States for further entry into the
ICR converts hand-printed characters to their machine-print (ASCII) equivalents, a significant step forward from older OCR systems, which read only machine print. The ability to recognise handprint significantly broadens the range of applications that benefit from automated ICR solutions, saving time and raising accuracy to levels not attainable by OCR or human intervention. ICR has the added benefit of reporting a level of confidence for each character result, where confidence is the engine's own judgment about the accuracy of its recognition. The characters that ICR considers unreliable are sent to human operators for double-checking.
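The confidence-based hand-off described above can be illustrated with a minimal sketch. It assumes a hypothetical ICR engine that returns one character and one confidence value between 0 and 1 per position; the `CharResult` type, the `route_field` function and the 0.90 threshold are illustrative choices, not details of the system used by KPSC.

```python
from dataclasses import dataclass


@dataclass
class CharResult:
    char: str          # character proposed by the (hypothetical) ICR engine
    confidence: float  # engine's own estimate of correctness, 0.0 to 1.0


def route_field(results, threshold=0.90):
    """Return the recognised text and the positions that need human review."""
    text = "".join(r.char for r in results)
    review_positions = [
        i for i, r in enumerate(results) if r.confidence < threshold
    ]
    return text, review_positions


# Example field: the third character was recognised with low confidence.
field = [
    CharResult("R", 0.99),
    CharResult("A", 0.97),
    CharResult("V", 0.62),
    CharResult("I", 0.95),
]
text, to_review = route_field(field)
print(text, to_review)
```

Only the flagged positions would be shown to an operator, so the share of manual work depends on where the threshold is set.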
“The use of simple technologies like Intelligent Character Recognition and Optical Mark Recognition could lead to a more efficient system in terms of less time, higher productivity and cost effectiveness, thus leading to better lives for the people through improved governance”
Hand-printed characters are created by humans, so understanding and interpreting the patterns of human writing is far more complicated than converting simple machine print. Like OCR engines, ICR engines perform recognition character by character and start by segmenting words into their component characters. In practice, ICR technology recognises separate words or word combinations, such as form fields, so letters must not be written sloppily or run together. People read text by scanning entire words, not individual characters, and ICR systems, like the most advanced OCR systems, try to imitate this approach. They use dictionaries that contain the possible field values, which facilitates word recognition by combining primary recognition results with alternate choices and then analysing the available alternatives.
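As a rough illustration of dictionary-assisted recognition, the sketch below assumes that a hypothetical engine returns, for each character position, a set of alternative characters with confidences. Each dictionary entry is scored by multiplying the confidences of its letters, and the best-scoring entry wins. The scoring rule, the function name and the sample district list are assumptions made for the example.

```python
def best_dictionary_match(candidates, dictionary):
    """Pick the dictionary word best supported by per-character alternatives.

    candidates: list with one dict per character position, mapping each
                alternative character to its confidence.
    dictionary: iterable of allowed field values.
    """
    best_word, best_score = None, 0.0
    for word in dictionary:
        if len(word) != len(candidates):
            continue  # this simple sketch only compares words of equal length
        score = 1.0
        for ch, options in zip(word, candidates):
            score *= options.get(ch, 0.0)  # 0 if the engine never proposed ch
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Per-position alternatives for a hand-printed district name (illustrative).
candidates = [
    {"M": 0.90, "N": 0.10},
    {"Y": 0.80, "V": 0.20},
    {"S": 0.60, "5": 0.40},
    {"0": 0.55, "O": 0.45},
    {"R": 0.70, "K": 0.30},
    {"E": 0.95},
]
districts = ["MYSORE", "MANDYA", "HASSAN", "KOLAR"]

# Primary result: the top choice at every position, without a dictionary.
primary = "".join(max(options, key=options.get) for options in candidates)
word, score = best_dictionary_match(candidates, districts)
print(primary, "->", word)
```

Here the primary reading contains a digit zero where the letter O was meant, and the dictionary lookup recovers the valid field value from the alternate choice.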
OMR is generally distinguished from optical character recognition by the fact that a recognition engine is not required, i.e. the marks are constructed in such a way that there is little chance of not reading the marks correctly. This requires the image to have high contrast and an easily recognisable or irrelevant shape.
This technology is useful for applications in which large numbers of hand-filled forms need to be processed quickly and with great accuracy, such as surveys, reply cards, questionnaires and ballots. A common OMR application is the use of “bubble sheets” for multiple-choice tests in schools. The sheets are fed through an Optical Mark Reader, a device that scans the document and reads the data from the marked fields. The error rate for OMR technology is less than 1%.
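Because OMR only has to decide whether a predefined position is marked, the core logic can be sketched without any recognition engine. The example below assumes that the scanner software has already measured the fraction of dark pixels in each bubble; the `read_question` function, the 0.5 threshold and the sample values are hypothetical.

```python
def read_question(darkness, mark_threshold=0.5):
    """Return the marked option, or None if the question is blank or ambiguous.

    darkness: dict mapping each option label to the fraction of dark pixels
              measured inside its bubble (0.0 = empty, 1.0 = fully filled).
    """
    marked = [opt for opt, d in darkness.items() if d >= mark_threshold]
    if len(marked) == 1:
        return marked[0]
    return None  # no mark or several marks: leave for a quality check


sheet = [
    {"A": 0.05, "B": 0.91, "C": 0.04, "D": 0.03},  # clear single mark
    {"A": 0.02, "B": 0.03, "C": 0.04, "D": 0.02},  # left blank
    {"A": 0.88, "B": 0.06, "C": 0.79, "D": 0.05},  # two bubbles filled
]
answers = [read_question(q) for q in sheet]
print(answers)
```

Questions that come back as `None` are the ones a manual quality check would need to look at.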
KPSC examination and technology application
For the Karnataka Public Service Commission examination, the Government engaged Comat to carry out the pre-examination and post-examination work for the selection of First Division Assistants (FDA) during the year 2005. The pre-examination work included the design, printing and supply of ICR application forms, processing of the filled-in ICR applications, customised application development, printing of admission tickets, a help desk, and printing of the nominal roll cum attendance sheet. The post-examination work included the design, printing and supply of OMR answer sheets, processing of the OMR sheets, and generation of customised reports.
Until then, the Public Service Commission had accepted simple paper-based application forms for all its exams. After receiving the filled-in application forms, with the candidate's photograph pasted on each form, the commission entered the information into the database manually. The Commission then had to print the hall tickets for the selected candidates before the exam. To prepare the hall tickets, the officials again had to type data such as the candidates' names and addresses manually on the tickets and paste the candidates' photographs on them once more. During the examination, the registration numbers were written on the answer booklets so that each booklet could be matched with the individual candidate.
The manual process involves the extra cost of processing each application form, including the time taken to type it. It also carries a higher risk of manual errors, which can be controlled only by adding Quality Control steps, with their own time burdens and costs. For preparing the hall tickets, the same information has to be copied manually from the application forms, which increases the chance of error, adds cost and leaves a larger window for corruption. Finally, the registration number on the answer booklet makes it easy for the evaluators to identify the candidate, leaving a window for manipulating the outcomes.
Working of ICR, OMR technology
The solution provided to address the above problems worked in two phases: the pre-examination process and the post-examination process. For the pre-examination process, the FDA application forms were designed and printed according to the requirements of the department. The application form was printed in two colours and included instructions, along with the declaration, to help the candidates fill in the application. The acknowledgement part of each application form was perforated for easy separation. The filled-in ICR application forms were scanned at the centres designated by the PSC officials, at a speed of 1,000 scans per hour. As part of processing the application forms, the photographs, signatures and addresses of the applicants were captured, the accuracy of information such as the date of birth was verified, and finally the examination centres were allotted according to the choice of the candidate.
One of the most important and efficient steps in pre-examination processing was the generation of the admission ticket with each candidate's photo and the examination details. The photo and address of the candidate were extracted from the application form automatically. The centre code and the centre name, according to the candidate's choice, were printed on the admission ticket. All the admission tickets were dispatched to the candidates one month before the scheduled examination date. As part of the pre-examination processing, a Helpdesk was also set up. Images of all the scanned applications and the details of the admission tickets, such as centre code and venue code, were available at that single point, which also issued duplicate hall tickets. There were well-publicised helplines through which candidates could address all the above issues. The last step in the pre-examination processing was the printing of the nominal roll cum attendance sheet, which confirms the candidate's identity with a photo. Each sheet had the details of up to 6 candidates, along with photographs, attendance, nominal roll, the OMR sheet numbers, question booklet series, subjects attended by the candidate, and signatures.
During the post-examination process, some 400,000 carbonless OMR answer sheets were designed, printed and supplied to the destination, along with one duplicate for the candidate's reference. The OMR answer sheets, each with a unique number, were printed in two different colours for Paper I and Paper II.
Processing of OMR answer sheets was done centrewise at KPSC centres by professionals having expertise in projects of similar nature. These OMR machines are capable of scanning around 2,500 answer sheets per hour with an accuracy of 98% without quality checks and 100% through quality checks. Finally, the data was entered in Excel.
The answer sheets had two barcoded portions, one containing the answers and the other containing the registration number with some more personal details, and the only way to match the two parts was through the barcodes. The portions were received separately for scanning, to prevent the people involved in scanning from identifying the candidate. After being scanned separately, the two sets of data were matched through their respective barcodes. The database and the software developed enable the user to generate customised reports based on marks, subject, etc., with provision for producing an individual mark sheet from them. The software also enables the user to convert Word-format documents to other required formats, such as PDF.
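The matching step described above amounts to joining two separately scanned data sets on the barcode. The sketch below is a minimal illustration under that assumption; the barcode values, registration numbers, scores and the `match_by_barcode` function are all hypothetical and do not reflect the actual software.

```python
def match_by_barcode(scores, identities):
    """Join anonymous scores with candidate identities via the barcode.

    scores:     dict mapping barcode -> score from the answer portion.
    identities: dict mapping barcode -> registration number from the
                identity portion.
    Returns (results keyed by registration number, unmatched barcodes).
    """
    matched = {}
    unmatched = []
    for barcode, score in scores.items():
        if barcode in identities:
            matched[identities[barcode]] = score
        else:
            unmatched.append(barcode)  # needs manual follow-up
    return matched, unmatched


# The answer portions are scored without any personal details attached.
scores = {"BC1001": 142, "BC1002": 97, "BC1003": 120}
# The identity portions are scanned separately.
identities = {"BC1001": "REG-0457", "BC1002": "REG-0012"}

results, unmatched = match_by_barcode(scores, identities)
print(results, unmatched)
```

The design point is that the people handling the answer portion see only barcodes, and the link to a candidate is made in software once scoring is complete.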
Impact and issues addressed
The evaluation of this project is based on outcomes that include impact in terms of number of people affected by the project, impact in terms of likely improvement of the quality of service, impact on the economy or the economic environment in the country, and impact through curbing channels of corruption.
Regarding impact in terms of the number of people affected by the project, about 400,000 applicants registered for the examination for the First Division Clerks of Karnataka Public Service Examination in the year 2005. The commission conducts several examinations of different kinds every year. If the same technology is applied to all public service commission exams, there is scope to benefit millions of people across all the different public examinations every year. In terms of likely improvement of the quality of service, the total time for the whole process, from the acceptance of applications to the declaration of results, was around 15-16 months for 1.10 lakh candidates in the year 2000. This came down to less than 4 months for 1.75 lakh candidates in the year 2005 with the application of this simple technology. Also, for the first time in the history of KPSC, the result was announced in a record time of only 40 days after the exam.
Considering the economic environment in the country, India had a population of about 40 million unemployed in 2004, with an estimated 32.4 lakh graduates among them. Given such a large number of unemployed graduates, public service examinations should function efficiently enough to encourage a large number of people to come under their purview, which will help build the best talent pool for the public services. A wait of 15-16 months is too long for candidates seeking a job, and the delay itself eliminates a large talent pool from the system.
Therefore, reducing the time required from the processing of applications to the declaration of results would give scope to a larger number of otherwise eligible candidates. It will benefit the economy both directly and indirectly. Even if the excess of examinees over actual vacancies cannot fill the posts because of skill mismatches, the extent of the problem could be reduced if a higher number of qualified candidates comes under the scope of the exam, creating a direct impact on the economy. Regarding the curbing of channels of corruption, the use of OMR answer sheets with barcodes, matched automatically from two separate parts, closes off the channels for manipulating the exam score. Because a candidate's exam scores are entered against the barcode number and only later matched with the decoded personal information, there is no window for identifying the candidate in order to manipulate the scores. This also enhances the credibility of the commission.
ICR technology in SSA program
Sarva Shiksha Abhiyan (SSA) aims to provide useful and relevant education to all children of the age group 6-14, as a part of the policy towards the Millennium Development Goals of the Government of India. The project requires current and comprehensive data, including the entire household level information relevant to the issues in education for all children of the same age in the country. The data collected is used for planning, monitoring, starting of Education Guarantee Scheme (EGS) centres, opening of new schools in school-less habitations etc. Access to accurate, complete and validated data on time is the key to any successful interventions leading to further policy implications. Yet similar household surveys conducted earlier suffered from certain shortcomings putting at risk the basis for some policy decisions.
In earlier years, data of the same nature was collected and consolidated manually at each habitation level and then computerised. The survey forms used for capturing the data were long and contained some repeated fields. The forms were simple paper-based ones requiring manual data entry. The earlier process contributed to several problems: delays in processing; errors in consolidation; poor “data granularity” for work at the habitation level; the challenge of ensuring that data is collected in the same time and space; difficulty in validating the data; the impossibility of individual child-based intervention; and difficulty in updating the data.
With the application of the ICR technology, the forms were reduced to a more relevant, simple, one-page data capture format. As part of the process, the number of households in each habitation or ward was first identified. A master data set, including the habitation list, was then created and validated at Block and District level. As part of the plan for data usage, a data mining tool was deployed. Data updating was done at Gram Panchayat level with the following procedures: the list of children is maintained at habitation level as a Village Education Register; the list is updated online and transmitted to the State Data Centre; data is extracted for discussion, as well as for corrective and preventive action, at “Gram Sabha” meetings; and key issues are escalated automatically to Special Project Directors.
Regarding the number of people affected by the project, all habitations were covered, with 100% coverage of children. In all, 10.2 million forms were scanned and converted to the relevant format using the ICR technology. The forms were verified in 15 days, at a rate of over 650,000 forms processed per day. The entire project was completed within 5 weeks and at reasonable cost. In terms of likely improvement of the quality of service, the application of ICR technology assured granularity at the individual child level and high-quality data that is objectively verifiable. The whole process was also completed in record time.
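The processing rate quoted above can be checked with simple arithmetic. The short sketch below uses only the figures given in the text (10.2 million forms, 15 days); the variable names are illustrative.

```python
forms_total = 10_200_000   # forms scanned, as stated in the text
verification_days = 15     # days taken to verify the forms

# Average number of forms verified per day over the whole period.
forms_per_day = forms_total / verification_days
print(f"{forms_per_day:,.0f} forms per day")
```

The resulting average of 680,000 forms per day is consistent with the stated rate of over 650,000 forms per day.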
The in-built flexibility in the new process facilitates further use of the same data. It allows tracking and planning at the individual child level. It allows dynamic “slice and dice” of data to assist with analysis for informed policy. It allows data sharing in the following ways: selective access to data on the State Data Centre for all stakeholders; regularly updated reports and views on the portal; custom data extracts for other departments as required; and transparency and access to updated data for the public. The impact of any public policy depends to a large extent on the quality of the data used for the study. Access to more recent and accurate data helps to generate a larger impact on the people. Without the help of technological innovations, therefore, the government faces a bigger challenge in framing its policies, because they are not supported by good-quality, timely and relevant data.
In general, as these two cases demonstrate, the Return on Investment and the positive effect on public life from reducing the cost and time of processing forms in public examinations, announcing results, or collecting data are much larger than the modest costs of this simple technology and approach. The well-managed application of simple technology assures a more efficient, transparent system with fewer channels of corruption. Identification of eligible beneficiaries becomes an easy process with access to more current and accurate data. Also, because the data is uploaded centrally for further use, the whole system works in a much more efficient and transparent way by eliminating the wrong identification of beneficiaries.