My couple of weeks in India wasn't quite long enough to see everything this mini-continent has to offer. This summer, I'm back, working for Shelter Associates again with two EWB volunteers, James and Bharat.
Shelter are engaged in an enormous survey project that will involve 30,000 questionnaires, one for each household in hundreds of slums around Pune. On my first day in the office last week, I caught up with Ross Plaster, a British architect who's been working with the organization for the last two years.
Ross launched into an explanation of their current survey process. The forms are detailed, with around 100 questions over 7 pages. Some questions are free-form — names, earnings, cities of origin — but most require coded responses:
26: Existence of toilet facility:
- 1-Own septic tank/flush latrine
- 2-Own dry latrine
- 3-Shared septic tank/flush latrine
- 4-Shared dry latrine
- 5-Community septic tank/flush latrine
- 6-Community dry latrine
- 7-Open defecation
Social workers navigate the slums on foot to complete the surveys. Large bundles of forms come back to the office, where a few of the employees transpose the responses into Excel spreadsheets. Dipti is one of them. She has been on some of the surveying missions, and done a lot of data entry too. As she input a form for us to watch, it was clear that the system was taxing her far more than it should have. The paper forms run vertically, with questions down the left side, and answers down the right. Dipti was entering data directly into the rows of a spreadsheet, extending horizontally. It takes a fair amount of mental effort to constantly transpose data from one orientation to the other.
The spreadsheet also has multiple sheets in order to separate out different aspects of the data for the organization's GIS software. This necessitates a whole lot of flicking back and forth through the form, as different parts are destined for different sheets. And there's nothing in place to ensure referential integrity for the data spread across them. While we watched, I saw a typo that would have orphaned a row.
Add to this the problem of collating work from multiple inputters. At present, they will each work on a separate spreadsheet and collate using cut and paste at a later date. In short, it's not surprising that Dipti was complaining of headaches.
Now I make web apps for a living, and you can imagine my internal monologue. I can fix this! I can solve your problems! I can take away your headaches! You need a web app! If there's one thing our modern web application frameworks can do, it's storing form inputs in a database.
I was already working out the database schema in my head, and mapping out an input form that corresponded much more closely with the survey sheets. With keyboard shortcuts for the codes and validation at every step, I was sure we could cut input time and errors dramatically. Plus, once the data was in the database, we could output it in any format required - for the GIS software or any other purpose.
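To make the validation idea concrete, here's a minimal sketch of how a coded answer could be checked before it reaches the database. The field name and the code table mirror question 26 quoted above; everything else (function name, error messages) is my own invention, not Shelter's actual system.

```python
# Code table for question 26, taken from the survey form quoted above.
TOILET_CODES = {
    1: "Own septic tank/flush latrine",
    2: "Own dry latrine",
    3: "Shared septic tank/flush latrine",
    4: "Shared dry latrine",
    5: "Community septic tank/flush latrine",
    6: "Community dry latrine",
    7: "Open defecation",
}

def validate_coded_answer(raw, codes=TOILET_CODES):
    """Return the integer code, or raise ValueError for bad input."""
    try:
        code = int(raw)
    except (TypeError, ValueError):
        raise ValueError(f"Not a number: {raw!r}")
    if code not in codes:
        raise ValueError(f"Unknown code: {code}")
    return code
```

Rejecting a bad code at entry time, rather than discovering it months later in a GIS export, is exactly the kind of error this system should catch.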
I pushed the idea a step further. If you've optimized the web form to model the survey form as closely and efficiently as possible, you might as well just skip the paper surveys and remove a major source of errors at the transposition stage. Why not collect data on laptops or tablet devices? Ross saw the potential immediately. James had already thought the same thoughts. We began discussing the technical feasibility. Were the mobile data connections reliable enough in the city? Ross thought so. Would it be better to store data locally and upload it later? Hey, could we integrate live GPS data into the process?
After ten minutes or so of excited discussion, Ross thought to ask Dipti what she thought about it all. She was incredulous. "Are you crazy? I would have so many kids around me poking that thing, no one would hear my questions. And the questions in the survey are personal. Who would answer me surrounded by everyone like that?"
It was a sharp comedown from our techno-idealist high. We had been trying to invent the best possible system to deal with an abstract problem, rather than the concrete problem facing the interviewers. We were thinking about how best to gather data, not how best to gather data from these particular people in their difficult circumstances. As fun as it is to imagine real-time, geo-linked survey data streaming into our server from the slums, we're stuck with good old pen and paper for now. On the plus side, we can get rid of that spreadsheet, and give them a way of entering data more quickly and accurately.
At Conversocial, we structure our work with pull requests. Each one is supposed to be reviewed by at least one other member of the team before it gets anywhere near the master branch. Adding comments to pull requests is a good way to get a dialogue going that can highlight potential problems with the code. Zach Holman posted a good summary of this working style last year.
Without much happening last weekend, I decided to create something to add a little bit of incentive and competition to our pull request process.
Version 0.1 of this system is tremendously simple. Every 10s it polls the GitHub API to see whether there are any new pull requests, or whether any have been updated. If so, it fetches the comments and gives out points based on their content. For simply making a comment on someone else's pull request, a developer receives one point. If the comment contains the ':sparkles:' emoji, then the author of the pull request receives 4 points (as long as the commenter isn't also the author!).
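The scoring rules above boil down to a few lines. Here's a sketch of that logic; the `(commenter, body)` pair shape is a simplification of what the GitHub API actually returns, not the real payload.

```python
from collections import Counter

def score_comments(pr_author, comments):
    """Tally points for one pull request.

    comments: iterable of (commenter, body) pairs, in any order.
    Rules: 1 point per comment from someone other than the author;
    4 points to the author when a comment contains ':sparkles:'.
    """
    points = Counter()
    for commenter, body in comments:
        if commenter == pr_author:
            continue  # authors earn nothing by commenting on their own PR
        points[commenter] += 1
        if ":sparkles:" in body:
            points[pr_author] += 4
    return points
```

A `Counter` makes it trivial to merge tallies across many pull requests before pushing the totals to the display.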
Accumulated points are displayed on an LED matrix hung on the wall of our office. My hope is that this will encourage more feedback in general, and give people a way to offer a colleague a little slap on the back for a clever algorithm or a nice bit of attention to detail.
That LED board is a Peggy that I made a couple of years back from a kit. It involved over 1300 solder joints, so I'm glad to see it in use again! It's only 25x25 LEDs though, so I had to economize on space.
The system is quite obviously open to gaming, but gaming attempts will have to take place in the open. Maybe we'll need to adjust the scoring and add more sophisticated measures to make it meaningful. Time will tell whether it has positive effects.
The code is on GitHub, of course.
I've made an Android App which you can pull out in such a situation to get an instant overview of what's going to be good right now. It's called "In Season" and I would be thrilled if you would go to the Android Market, download it and tell me what you think of it.
The code is available, open-sourced under the GPL, on GitHub.
Here's a short post written while I'm still on a high from a superb Code Retreat at Eden Development in Winchester. I don't think I've ever been around such a passionate bunch of software people before. It was totally inspiring.
For me, it really felt like a retreat. A retreat from frameworks, from legacy code, from issue tracking, from release management. It was just tests and code. I tried 7 different approaches to the same problem with 7 different people in 6 different languages. At the end of every cycle, we deleted everything and started fresh.
The main point of this post is to give some Google Juice to the people who put the day together: the aforementioned Eden Development; everyone I worked with; Aimee and Despo, who organized; and Enrique Comba Riepenhausen, who provided constant challenges and guidance.
Update 2011-06-12: Just noticed this article about the current unrest in Syria. The #syria hashtag is apparently being flooded by government spam. This shows that this problem really exists, whether or not the following offers any real chance for a solution.
Twitter is the opposite of an authoritative source. It's a mighty stream of uncertain information, and trying to take it all in is known as 'drinking from the fire-hose.' Hearsay gets cut and pasted as fact. The most sensational news gets retweeted most often. Before you know it, you've got running gun battles on Oxford Street.
There's been some debate about whether Twitter played any role in bringing down Mubarak in Egypt. It can't be doubted that technology shaped the way we remote observers heard about the situation, but it's a stretch to claim it provided useful information to the people doing the protesting.
Is it possible to take the fire-hose and make it useful for such people? The penetration of Twitter and similar social media is increasing. Smartphones are heading towards ubiquity. Is there a way to make these trends work for an internet-connected protester in a Tahrir-square situation?
I thought it might be worth writing a quick follow up to the Wikipedia Visualization piece. Being able to parse and process all of Wikipedia's articles in a reasonable amount of time opens up fantastic opportunities for data mining and analysis. What's more, it's easy once you know how.
Many Wikipedia articles are tagged with geographic coordinates. Many have references to historic events. Cross referencing these two subsets and plotting them year on year adds up to a dynamic visualization of Wikipedia's view of world history.
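The cross-referencing step can be sketched simply: keep only the articles that carry both a coordinate and a year, and bucket them by year so each frame of the animation plots one year's points. This is an illustrative sketch, not the original processing code, and the dict keys are assumed field names.

```python
from collections import defaultdict

def frames_by_year(articles):
    """articles: iterable of dicts with 'lat', 'lon' and 'year' keys.

    Returns {year: [(lat, lon), ...]} covering only the articles that
    fall in both subsets (geotagged AND dated).
    """
    frames = defaultdict(list)
    for a in articles:
        if a.get("lat") is None or a.get("year") is None:
            continue  # article is missing one of the two subsets
        frames[a["year"]].append((a["lat"], a["lon"]))
    return frames
```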
Creating and referencing an SQLite database is straightforward in an Android app. The documentation you'll find at the Android Developer site and around the web is more than enough to get up and running rapidly.
But as you can imagine, there are a few nasty pitfalls lurking behind such a simple API. As I was playing with Android's database classes, I rapidly came across situations that required a deeper appreciation of their structure and roles.
Michael Brunton-Spall gave a great little talk on his debugging methods at ScaleCamp last Friday. He based it around a procedure called Analysis of Competing Hypotheses (ACH). Boiled down a bit and applied to debugging, ACH instructs you to:
- List all possible causes for a given bug. Get others to offer their own hypotheses. Don't throw any away yet.
- List what's already known as evidence for and against each hypothesis.
- Building on step 2, decide what extra evidence you need to gather to refute each hypothesis.
- Gather it, and eliminate possibilities until you have one left.
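The bookkeeping in those steps is simple enough to sketch in code. Here a hypothesis is a bag of attributes and each piece of evidence is a predicate a surviving hypothesis must satisfy; the bug attributes and hypothesis names are invented for illustration.

```python
def eliminate(hypotheses, evidence):
    """hypotheses: dict of name -> attributes.
    evidence: list of predicates; a hypothesis survives only if it is
    consistent with every piece of evidence gathered so far."""
    survivors = dict(hypotheses)
    for consistent_with in evidence:
        survivors = {name: h for name, h in survivors.items()
                     if consistent_with(h)}
    return survivors

# A toy bug hunt:
hypotheses = {
    "stale cache":    {"affects_all_users": True,  "started_after_deploy": False},
    "bad deploy":     {"affects_all_users": True,  "started_after_deploy": True},
    "user's browser": {"affects_all_users": False, "started_after_deploy": False},
}
evidence = [
    lambda h: h["affects_all_users"],      # every user sees the bug
    lambda h: h["started_after_deploy"],   # it began right after the deploy
]
```

Running `eliminate(hypotheses, evidence)` leaves only "bad deploy": each observation falsifies whole hypotheses at a stroke, which is the efficiency the next paragraph describes.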
I think the most powerful thing about this process is the way it forces us to falsify and eliminate hypotheses, countering the very human instinct to hunt for evidence to support our hunches. Falsification is efficient: it can often be accomplished with a single piece of inconsistent information, lasering through all the other evidence with a satisfying logical zap.
At the Londroid meet-up last Thursday, we had an all-too-brief overview of Android 2.3's new features from Reto Meier, and you could feel the crowd's eagerness to get their hands on it and get developing.
Touch screen phones have given us an important new way of interacting with computers. A great user interface can give the feeling of getting things done effortlessly, with minimal interference between your intentions and your actions.
However, there's one thing that can destroy that feeling of effortlessness, even in an otherwise great UI: lack of responsiveness. If you tap or drag and experience a delay before seeing the result of your action, you'll have to start taking that delay into account in everything you do. The illusion of effortlessness vanishes.
- The main GUI thread, and what not to do with it
- Multi-threaded programming with Executors
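On Android the specifics live in `java.util.concurrent`, but the pattern behind both bullet points is language-agnostic: never block the GUI thread; hand slow work to an executor and collect the result when it's ready. A Python analogue using `concurrent.futures`, with a stand-in for the slow call:

```python
from concurrent.futures import ThreadPoolExecutor

def slow_fetch(url):
    # stand-in for a network call that must never run on the GUI thread
    return "response from " + url

with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(slow_fetch, "http://example.com")
    # the GUI thread would stay free here to handle taps and redraws;
    # future.result() blocks only when you actually need the answer
    result = future.result()
```

The same shape holds in Java: `ExecutorService.submit` returns a `Future`, and the UI thread only touches it once the work is done.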
I wanted to get somewhere fast with Android's 2D libraries, so I leapt right in with some JavaDoc trawling and some learning-by-doing. I've written up my couple of hours of experimentation below.
- Declarative layouts
- Drawing on a Canvas
- Customising a view's measurements and positioning
Setting a goal
In a previous project, I made a simple web service which allowed "check-ins" to record progress and would produce data representing a graph of check-ins per period. I thought that hooking this service up to an android app might be a good medium-term project. As a tiny first step I thought I'd try and produce an on-screen graph. This gave me a small, achievable goal to hack towards.
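Before any pixels get drawn, the graph needs its data: check-ins bucketed per period. The post doesn't show the web service's actual response format, so the timestamp-list shape below is an assumption; the bucketing itself is a one-liner.

```python
from collections import Counter
from datetime import datetime

def checkins_per_day(timestamps):
    """timestamps: iterable of ISO 8601 date-time strings.
    Returns a Counter mapping 'YYYY-MM-DD' to check-in count."""
    return Counter(
        datetime.fromisoformat(t).date().isoformat() for t in timestamps
    )
```

Each (day, count) pair then becomes one bar or point in the on-screen graph, which is where the Canvas work from the list above comes in.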