My couple of weeks in India wasn't quite long enough to see everything this mini-continent has to offer. This summer, I'm back, working for Shelter Associates again with two EWB volunteers, James and Bharat.
Shelter are engaged in an enormous survey project that will involve 30,000 questionnaires, one for each household in hundreds of slums around Pune. On my first day in the office last week, I caught up with Ross Plaster, a British architect who's been working with the organization for the last two years.
Ross launched into an explanation of their current survey process. The forms are detailed, with around 100 questions over 7 pages. Some questions are free-form (names, earnings, cities of origin), but most require coded responses:
26: Existence of Toilet facility:
- 1-Own septic tank/flush Latrine
- 2-Own dry latrine
- 3-Shared septic tank/Flush Latrine
- 4-Shared dry Latrine
- 5-Community Septic tank/Flush Latrine
- 6-Community dry latrine
- 7-Open Defecation
Social workers navigate the slums on foot to complete the surveys. Large bundles of forms come back to the office, where a few of the employees transcribe the responses into Excel spreadsheets. Dipti is one of them. She has been on some of the surveying missions, and has done a lot of data entry too. As she entered a form for us to watch, it was clear that the system was taxing her far more than it should have. The paper forms run vertically, with questions down the left side and answers down the right. Dipti was entering data directly into the rows of a spreadsheet, which extend horizontally. It takes a fair amount of mental effort to constantly transpose data from one orientation to the other.
The spreadsheet also has multiple sheets in order to separate out different aspects of the data for the organization's GIS software. This necessitates a whole lot of flicking back and forth through the form, as different parts are destined for different sheets. And there's nothing in place to ensure referential integrity for the data spread across them. While we watched, I saw a typo in an identifier that would have orphaned a row, leaving it with no matching record on the other sheets.
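To make the referential integrity problem concrete, here is a minimal sketch in Python of the kind of check the spreadsheet lacks. Everything in it is an assumption for illustration: the `household_id` key, the sheet names, the `find_orphans` helper and the sample rows are all made up, not taken from Shelter's actual files.

```python
def find_orphans(parent_rows, child_rows, key="household_id"):
    """Return the child rows whose key matches no row in the parent sheet."""
    parent_keys = {row[key] for row in parent_rows}
    return [row for row in child_rows if row[key] not in parent_keys]


# Hypothetical data: one "sheet" per list, one row per dict.
households = [
    {"household_id": "H001"},
    {"household_id": "H002"},
]
sanitation = [
    {"household_id": "H001", "q26": 3},
    {"household_id": "H0O2", "q26": 7},  # typo: letter O instead of a zero
]

# The mistyped row no longer belongs to any household.
orphans = find_orphans(households, sanitation)
```

A database with a foreign key constraint would reject the mistyped row at entry time, rather than leaving it to be discovered later.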
Add to this the problem of collating work from multiple inputters. At present, they will each work on a separate spreadsheet and collate using cut and paste at a later date. In short, it's not surprising that Dipti was complaining of headaches.
Now I make web apps for a living, and you can imagine my internal monologue. I can fix this! I can solve your problems! I can take away your headaches! You need a web app! If there's one thing our modern web application frameworks can do, it's storing form inputs in a database.
I was already working out the database schema in my head, and mapping out an input form that corresponded much more closely with the survey sheets. With keyboard shortcuts for the codes, and validation at every step, I was sure we could cut input time and errors dramatically. Plus, once the data was in the database, we could output it in any format required, for the GIS software or any other purpose.
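As a rough illustration of what I had in mind, here is a sketch in Python of validating one coded answer and exporting records as CSV. The code table is copied from question 26 on the form; the function names, the `household_id` and `q26` field names, and the CSV layout are my own assumptions, not anything Shelter uses.

```python
import csv
import io

# Codes for question 26, as listed on the paper form.
TOILET_CODES = {
    1: "Own septic tank/flush latrine",
    2: "Own dry latrine",
    3: "Shared septic tank/flush latrine",
    4: "Shared dry latrine",
    5: "Community septic tank/flush latrine",
    6: "Community dry latrine",
    7: "Open defecation",
}


def validate_code(raw, codes):
    """Return the typed answer as an integer code, or raise ValueError."""
    try:
        code = int(raw.strip())
    except ValueError:
        raise ValueError(f"not a number: {raw!r}") from None
    if code not in codes:
        raise ValueError(f"unknown code {code}; expected one of {sorted(codes)}")
    return code


def to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical record: a single keystroke "3" becomes a checked answer.
record = {"household_id": "H001", "q26": validate_code("3", TOILET_CODES)}
exported = to_csv([record], ["household_id", "q26"])
```

Because every answer passes through `validate_code` before it is stored, a stray keystroke like `9` is caught at entry time instead of surfacing later in the GIS export.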
I pushed the idea a step further. If you've optimized the web form to model the survey form as closely and efficiently as possible, you might as well skip the paper surveys altogether and remove the transcription stage, a major source of errors. Why not collect data on laptops or tablet devices? Ross saw the potential immediately. James had already had the same idea. We began discussing the technical feasibility. Were the mobile data connections reliable enough in the city? Ross thought so. Would it be better to store data locally and upload it later? Hey, could we integrate live GPS data into the process?
After ten minutes or so of excited discussion, Ross thought to ask Dipti what she thought about it all. She was incredulous. "Are you crazy? I would have so many kids around me poking that thing, no one would hear my questions. And the questions in the survey are personal. Who would answer me surrounded by everyone like that?"
It was a sharp comedown from our techno-idealist high. We had been trying to invent the best possible system to deal with an abstract problem, rather than the concrete problem facing the interviewers. We were thinking about how best to gather data, not how best to gather data from these particular people in their difficult circumstances. As fun as it is to imagine real-time, geo-linked survey data streaming into our server from the slums, we're stuck with good old pen and paper for now. On the plus side, we can get rid of that spreadsheet, and give the data entry staff a way of entering data more quickly and accurately.