1. Scary Words I Hear in GovTech: Maintenance

    I hear people in government talk about software maintenance a lot. What they usually mean1 is they have paid for a software application to be built by a contractor and now they want to put it on a shelf and pay someone (possibly a different contractor) to dust it off every now and then. Maybe they’ll even pay someone (very possibly yet another contractor) to be on hand should that application fall off the shelf and break – but only if this happens during business hours (9-12pm and 2-5 ET, Mon-Fri).

    Now, if you’re in the private tech sector, you’re probably confused. That doesn’t sound like product development, you might be thinking, and you’d be right. This isn’t product development, and unfortunately, it’s often not even what you might consider product support.

    When companies in the private sector treat their products like this – as something to be built once and then “maintained” – these products eventually lose all their users. Sometimes companies put legacy products on life support because they intend to deprecate the product and ask their users to move on by a specific sunset date. For the rest of them, however, they lose users because their users find a better alternative that meets their evolving needs.

    The government rarely sunsets products or processes. Let’s be real: we’re talking about an organization that still requires employees to fax HR paperwork.2 Users also rarely have alternatives, except for when private industry comes along and convinces them that they should pay for basic services (e.g. filing taxes) or when nonprofits emerge to help with basic social needs (e.g. protecting against housing discrimination).

    But just because users can’t really leave doesn’t mean they don’t matter. With government products, the users are the public, are the customers, are the taxpayers. Government loses money and trust when it doesn’t solve real problems, respond to user needs, and demonstrate value. And spending a lot on software that sits around gathering dust, shedding users, or frustrating the users who remain is expensive and doesn’t serve the public.

    So, rather than think of software as something static to be maintained, let’s think of software as a product – in govtech, usually a product whose users are either the public or government users facilitating service delivery. The key product idea to keep in mind is that the technology isn’t the end goal: solving a problem is the end goal.3

    When you’ve built a software application, you don’t measure its success by how often it doesn’t fall over – a.k.a. how well it is “maintained” – but by how well it solves the problem you built it to solve.4 This involves understanding users and their problems, setting goals, and building towards those goals. It involves constantly monitoring and assessing the product and its performance against those user-centric goals. And that means continuing to stay in touch with users and understand them.

    Example: Applying for a driver’s license

    The problems related to this process vary from state to state, and I’m not going to pretend to be an expert. But I have been in many a DMV (Dept. of Motor Vehicles) and applied for a driver’s license in 3 states, and, as a user myself, I’ve experienced some problems. One big one I’ve faced across all three states is that it takes quite a long time to apply for and receive a driver’s license. Most recently it required multiple trips to the DMV, and at the last one, I filled out a form in person and then watched the person behind the desk type it up for me before having me double check that it was correct.

    If you were the product manager for the DMV,5 you’ve got multiple users to think about (driver’s license applicants, DMV staff, DMV executives, etc), but in this case let’s just focus on the public user, the driver’s license applicant.

    For this problem, you’ve also got multiple metrics you can look at, but for simplicity, you could identify one key metric and set an objective for it: the time it takes an applicant to get a driver’s license, from starting the application to receiving the license. You want to decrease this time.

    You’ve identified the users, problem, and measurable objective, so now you follow human-centered design principles and determine that most of the time is spent compiling the right documents, filling out forms, and visiting the DMV. So you build an online application product that allows complete online submission of the paperwork, so users only need to visit the DMV for their signature and photo. Lo, after user testing and deploying the product, you find that you hit your objective. You conclude that your product successfully addresses the problem for driver’s license applicants.

    But you can’t stop there. And you can’t just think of future work on this product as maintenance in the form of bug fixes, security, and system uptime.

    Users change. They change in demographic makeup and how they use technology. Half of users may use this product on mobile today but next year? Maybe that’s up to 75%. If users change, their problems probably change too. In this case, hell, with the promise of autonomous cars, users are probably going to change how they use cars in the not too distant future – and what will that mean for driver’s license applications and this product you’ve built?

    Technology changes. What worked on Internet Explorer 6 may not work on the latest version of Firefox. What worked on desktop may not work on mobile. What is the most efficient and secure programming language or framework today won’t be in 2-5 years, much less 10 or 20 years from now.

    Policy changes. Requirements might be added or removed from the application process. Policy might even come down that mandates how quickly the DMV must process applications, which might affect how you think of product success.

    Maintaining software isn’t good enough. From the time of design, development, and launch, the software you’ve built is a living part of how you deliver services, and should be constantly evaluated and evolved alongside the evaluation and evolution of your agency’s service delivery as a whole – which should include assessing customer satisfaction, impact, efficiency, and other metrics and goals.

    You might even find that your technology solution no longer helps your agency’s overall goals or solves your customers’ problems – in which case, don’t be afraid to sunset the product, document your learnings, and iterate on a new solution.6

    1 Usually, but not always! There are amazing people in govtech implementing more effective product development approaches. #NotAllGovTechies
    2 True story: I sent my first ever fax last month.
    3 This is usually true in the private sector too, except in cases like the Yo app.
    4 Although, yes, system performance and hitting metrics defined in service-level agreements (SLAs) are both important.
    5 Wouldn't it be cool if DMVs had product managers?
    6 In government and looking for something to sunset? I vote starting with the fax. And sending user passwords by physical mail. And websites that only work in Internet Explorer. See, there are so many things ripe for deprecation!

    "Expect delays" by Tom Woodward is licensed under CC BY-SA 2.0

  2. Where are the Government API Directories?

    I’ve been working on organizing some departmental knowledge at CMS, and our human-centered design team – the crew that promotes design thinking and helps other teams build better, more user-focused products and processes1 – recommended that my first step be to get teams to make directories. It’s a huge task to get everyone to use Confluence or to use it consistently. Instead, make sure each team has one Confluence page that links out to everything someone might need, and keep a central directory linking out to those teams’ directory pages.

    Directories aren’t just useful to collect resources in a collaborative team setting. Google found success in being the biggest, baddest directory2 of the World Wide Web in the whole wide world. Its premier product is essentially a list of links to other pages on the internet. The web wouldn’t be what it is today without lists of links, and without people or programs making lists of links.3

    This isn’t a new concept: tables of contents and indices in the backs of books have been providing lists of links for millennia.4 Tables of contents, indices, directories, catalogues, registries, etc – these are all about empowering the user. Do you think you know what people are looking for? You don’t. That’s why you have to empower people to find it themselves and access what they find. In other words: You need to give them a list of links – and maybe a decent search function.5

    In the context of government, it may seem like a no-brainer that a government website should have directories: a directory of all the services offered by that agency, or a directory (a.k.a. sitemap) of all the pages you can find on that website. Indeed, most government websites have these things. In fact, this may seem so obvious you’re probably wondering why I’m writing about it.

    The thing is, when it comes to APIs and specifically government APIs, we don’t see a lot of directories.

    Right now there is no complete (or even partially complete) authoritative list of US federal APIs. Here’s what does exist (that I could find this weekend):

    • 18F used to maintain a list of federal developer hubs, which is a pretty decent substitute for a list of federal APIs. An organization’s developer hub would typically list all of that organization’s APIs; however, this isn’t always the case. Furthermore, the GitHub repository for the website’s code has been archived, and the last update was made in September 2018.6

    • Programmable Web’s Government category in their API directory currently contains 772 APIs. However, these include APIs from governments across the world (e.g. Singapore and New Zealand), and as far as I can tell, there’s no way to filter these by country. Furthermore, some of these APIs are not published by governments but by private companies or other organizations.

    • The next closest thing I could find is data.gov. Notably, data.gov is the US federal directory for open datasets, and datasets are not the same thing as APIs. APIs are complete software products: they have a full lifecycle from strategy and design, to testing and deployment, to marketing and change management. Plus, they can be transactional, allow you to send data back (a.k.a. “write” APIs), or provide services (e.g. enabling you to submit a FOIA request via API). Some of the datasets linked to from data.gov are available via APIs, in addition to being available as flat file downloads. You can specify “API” in the data directory to find these.7
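    (A quick, purely illustrative aside on that dataset-versus-API distinction, for those who think in code: in the sketch below, the URLs, endpoint, and request fields are hypothetical placeholders – not real data.gov or FOIA endpoints. The point is just that a flat-file dataset is something you download, while a transactional “write” API is something you send structured data to and get a response back from.)

```python
import json
from urllib.request import Request, urlopen


def download_dataset(url: str) -> bytes:
    """A dataset is read-only: you fetch a flat file, and that's the whole interaction."""
    with urlopen(url) as response:
        return response.read()


def submit_foia_request(api_base: str, requester: str, description: str) -> dict:
    """A transactional ("write") API: you send structured data and the service acts on it.

    The endpoint path and fields here are hypothetical, for illustration only.
    """
    payload = json.dumps({"requester": requester, "description": description})
    request = Request(
        f"{api_base}/requests",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(request) as response:
        return json.load(response)  # e.g. a tracking number for the new request


# Hypothetical usage (placeholder URLs, not real endpoints):
# csv_bytes = download_dataset("https://data.example.gov/providers.csv")
# receipt = submit_foia_request("https://api.example.gov/foia/v1", "Jane Doe", "Records about X")
```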

    I’ve done some research into other government API directories as well, and haven’t come up with a whole lot. Here are a couple:

    • New Zealand has an API catalogue that you can search, and when you select any API, you are directed to a page with both documentation and an API console rendered from the API’s OpenAPI definition.

    • The UK Government Digital Service (GDS) recently started an API catalogue initiative.

    Why bother with government API directories?

    API directories are important because APIs are products. An API directory is a product directory: you can think of it as both an inventory for the business owner and a catalogue of available products for the customer. Governments should have directories of their API products so that they themselves know what they have across different silos (a.k.a. agencies and departments) and can begin to collaborate and share knowledge (and eventually infrastructure), and so that the public can discover and use services and products offered as APIs.

    You know that feeling when you go to an ice cream shop and want mint chocolate chip but it’s not on the menu, so you settle for chocolate chip cookie dough, only to find out after you’re halfway through that the shop had mint chocolate chip all along? Or you find out the shop had extra rich, dairy-free dark chocolate sorbet, which you’d never heard of before but definitely would’ve ordered if you’d seen it on the menu? Yeah, you feel a bit cheated, but in a way where no one wins.

    That’s how I felt when I discovered the NPPES API, CMS’s provider lookup API: I had already seen CMS’s developer portal and thought I knew what was on CMS’s menu, only to discover this other API weeks later in a meeting. Needless to say, the dev portal does not list the NPPES API.8

    Why are API directories so hard?

    People have been trying to build API directories for years. Programmable Web, RapidAPI, and others have API directories of varying levels of freshness and accuracy. It’s hard – especially when you’re relying on humans to create and maintain these directories. There are so many APIs, and they change: they get new versions, or they get deprecated, or their documentation moves to a different URL. It’s a lot to keep track of.

    Wouldn’t it be great if we could automate creating and maintaining API directories somehow?

    Luckily, people have been working on ways to do just that!

    • APIs.json: This project, started by the API Evangelist and 3scale, aims to create a standard, machine-readable way for API providers to describe and share their API operations, similar to how websites are described using sitemap.xml (which is a pretty standard part of websites now). You can search APIs that are described by apis.json files at the related search engine project: apis.io.

    • JSON Home: Similar to the above, this project aims to provide a standard for machine-readable “homepages” for JSON APIs.

    The goal with both of these standards is that you can figure out what APIs are offered by a given company or organization simply by going to the API homepage. That homepage is essentially a directory of the available APIs, with links to the documentation of each API.
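    As a rough illustration, an agency’s machine-readable API directory could look something like the snippet below. I’m writing it as a Python dict (the structure mirrors the JSON you’d actually publish), and the field names are an approximation in the spirit of apis.json rather than the official schema – check the spec itself for the real keys. The agency, URLs, and API are all made up.

```python
# An illustrative, apis.json-style directory for a single (hypothetical) agency.
# Written as a Python dict; the published file would be plain JSON.
EXAMPLE_AGENCY_DIRECTORY = {
    "name": "Example Agency",
    "description": "APIs offered by the Example Agency",
    "url": "https://agency.example.gov/apis.json",
    "apis": [
        {
            "name": "Provider Lookup API",
            "description": "Search the agency's registry of providers",
            "humanURL": "https://agency.example.gov/developers/provider-lookup",
            "baseURL": "https://api.agency.example.gov/provider-lookup/v1",
            "properties": [
                # Pointer to a machine-readable definition (e.g. an OpenAPI document).
                {
                    "type": "OpenAPI",
                    "url": "https://agency.example.gov/provider-lookup/openapi.json",
                },
            ],
        },
    ],
}
```

    A crawler or search engine (apis.io, for example) can read a file like this from each organization and learn what APIs exist and where their documentation lives, without anyone curating a list by hand.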

    How can we get to maintainable, sustainable government API directories?

    Many governments already have API standards and guidelines that they publish and – hopefully – adhere to. I would like to see each of these documents include a requirement that agencies keep an up-to-date, machine-readable directory of their API offerings that links to the OpenAPI (or other standardized) definition for each API. One way they could do this is by having an APIs.json file live at agency.gov/apis.json – or they could use JSON Home or other emerging standards for machine-readable API directories.

    The idea is, if you are a central government and you have some agencies publishing APIs, they could list their APIs as data in a machine-readable format on a URL that they’ve given you that doesn’t change, and then you can have a website that grabs that data from those URLs in real-time. Then, you can display the data however you like – maybe you jam all these lists into one list and display the list with pretty styling and let users search by keyword or filter by agency. Then, BAM, you have a centralized API directory that gives a coherent and accurate picture of all the APIs provided across your agencies, and all you had to do was add the agencies’ directory URLs to your website’s list of URLs to get data from.
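    Here’s a minimal sketch of that aggregation step, assuming each agency publishes an apis.json-style file like the one above at a stable URL. All URLs and field names below are hypothetical, and a real implementation would add caching, validation, and so on.

```python
import json
from urllib.request import urlopen

# Hypothetical, stable per-agency directory URLs the central site knows about.
AGENCY_DIRECTORY_URLS = [
    "https://agency-one.example.gov/apis.json",
    "https://agency-two.example.gov/apis.json",
]


def fetch_agency_directory(url: str) -> dict:
    """Fetch one agency's machine-readable, apis.json-style directory."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def build_central_directory(urls: list[str]) -> list[dict]:
    """Merge every agency's API list into one flat, searchable directory."""
    central = []
    for url in urls:
        try:
            directory = fetch_agency_directory(url)
        except OSError:
            continue  # skip agencies whose directory is temporarily unreachable
        for api in directory.get("apis", []):
            central.append({
                "agency": directory.get("name", url),
                "name": api.get("name"),
                "description": api.get("description"),
                "documentation": api.get("humanURL"),
            })
    return central


if __name__ == "__main__":
    # Display however you like: pretty styling, keyword search, agency filters.
    print(json.dumps(build_central_directory(AGENCY_DIRECTORY_URLS), indent=2))
```

    Adding a new agency to the central directory then really is just adding one URL to that list.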

    Some governments and agencies already require API providers to register their APIs, though this isn’t currently required to be in machine-readable formats.

    I was excited to read that data.gov is already pushing a similar initiative for open datasets: by the end of 2020, all agencies have to publish catalogues of their datasets at agency.gov/data, and each dataset or data API offered by an agency must be described by a standardized data.json file that contains metadata. This makes automated discoverability and maintenance of the federal data directory not only possible, but easy. Let’s work towards making this same principle a reality for government APIs as well.

    1 Interested in how we use human centered design at CMS? Read about it on the USDS blog.
    2 Other superlatives we could add: Creepiest, greediest, megalomaniacalest, etc.
    3 Unfortunately, Google's dominance as The Directory of the Internet is often a big deterrent from people making their own.
    4 According to Wikipedia. Wikipedia is another great example of a product using links to help people find what they’re looking for -- and things they weren't. Alas, the many hours lost to Wikipedia rabbit holes.
    5 Wondering what to get me for my birthday? That’s right - give me a list of links, maybe one with links to cool things to do in DC, because damn is it hard to find non-museum or non-institution events in this city.
    6 Interestingly, it looks like they'd had other ideas for creating a canonical list before, based on this old, never-completed GitHub repo.
    7 But this isn’t really a directory of the sort I’m looking for, and it appears only two departments publish APIs to this catalogue. These APIs are open data APIs and some of these APIs use HTML as a content type, so can these even be called APIs?
    8 To be fair, the NPPES API replaces a data download, so the providers may be thinking of it more as an open data offering than an API product, but still -- this is an API I'd like to use in a real-time manner, and if I hadn't been in the room to hear someone mention it, I would not even have known about it.

    "Table of contents in Benedictus de Nursia: De conservatione sanitatis" by University of Glasgow Library is licensed under CC BY-NC-SA 2.0

  3. The Sound of 2020

    Hello, internet, my old friend, I’ve come to talk with you again. Not about visions softly creeping, leaving seeds while I was sleeping. No, I haven’t been sleeping – I’ve barely had time to, between all the work I was doing in 2019 and the number of times I listened to this Simon and Garfunkel album on repeat.1 It turns out I have a hard time saying no to interesting projects – and I’ve been focused more on doing than on writing about what I’ve been doing. Now, it’s time to reverse that trend, and disturb the sound of silence.2

    2019, phew

    I’ll start with my biggest news of all: In the fall, I moved to Washington, DC, to start a “tour of civic service” with the United States Digital Service (USDS). I’m working with the Centers for Medicare and Medicaid Services to help fix some small part of the healthcare system through thoughtful interoperability and better, more “modern” gov tech. I’ll write more about this later, but in the meantime, check out one of the products I’m working on: Data at the Point of Care.

    But what else was I up to in 2019? In roughly chronological order:

    Early in the year, Programmable Web commissioned me to investigate the technical feasibility of turning away from that neon god we’ve made,3 Facebook, and taking your data with you – and then to figure out what your options are if you do manage to do that. I dove into the delightful, difficult world of the decentralized web and specifically standards and projects around decentralizing social networks.4

    Around that time, the European Commission’s APIs4Dgov project brought together a group of researchers and writers, including myself, to prepare a report making recommendations to EU member states on developing their API strategies. I worked on the technical framework recommendations, and in June contributed to a day-long workshop at APIdays Helsinki for folks in gov tech to share knowledge and best practices for implementing government API products. Report to be published this year.

    In May, we launched our first REST Fest in Poland, and in September we kicked off our tenth year of REST Fest in Greenville, SC. REST Fest is an unconference dedicated to APIs, hypermedia, and web architecture, and I’ve been involved in organizing it for the past few years. The core principle is that everyone talks, everyone listens, and the goal is to help foster a collaborative, non-intimidating environment to discuss and hack on tools for APIs.

    Open Referral, an initiative pushing forward interoperability of health and human services data, has been gaining adoption and over 2019 launched multiple partnerships to implement their Human Services Data Specification (HSDS) and open source tools in different cities. Building on some work I started while mentoring students at HackIllinois (a 2-day college hackathon) in February, I helped code and deploy open source MVPs (minimum viable products) of tools to convert data from unstructured or miscellaneous formats to HSDS-compliant datasets.

    So, 2020

    I’m tempted to start this section with “Ain’t no rest for the wicked” but alas, that’s not within the song scope of this post. The sentiment, however, still applies.

    While I continue full-time at USDS, I’m going to get back into writing here and speaking at conferences. Over my career I’ve done a lot of work adjacent to government – like volunteering with Code for America brigades, consulting for municipalities, and building products for government users – and now that I’m on the inside, I’ll have a lot to share about what I observe and learn.

    Questions I’m thinking about right now:

    • What does “modernization” mean for government technology?
    • How do we build digital public infrastructure that lasts years or decades while staying flexible, adaptive, and user-centered?
    • How does open source meet or not meet the needs of gov/civic tech?
    • How can we learn from decentralization to build more effective or more resilient public infrastructure?

    I’d love to hear what questions you have too – reach out on Twitter or respond to my newsletter (sign up in the footer at the bottom of this page) to talk.

    1 Yes, I got this album on vinyl, because I'm that cool.
    2 I wish I could say that this is the last time I use a Simon and Garfunkel song as metaphorical fodder.
    3 Okay, okay, I think this is the last Simon and Garfunkel reference in this post.
  4. The White House API Standards and the Ancestry of Government API Guidelines

    Note: This is somewhat technical and focused on APIs, but may also be of interest to anyone who cares about how conventions and trends for digital government spread.

    I’m in the middle of a research project on government APIs, and as I’ve read more and more examples of API guidelines from governments across the world, it’s struck me how so many of them can trace their roots back to the White House API Standards. Even if I haven’t found evidence of direct lineage from the White House standards to a given API playbook, that playbook usually at least cites the White House repo as a resource or example for further reading.

    Origins of the White House API Standards

    In 2011, President Obama issued an executive order mandating that US federal agencies make web APIs. I can imagine that things got chaotic pretty quickly, with competing conventions struggling for dominance like tortoises crawling on top of each other for the sunniest spot on the rock, because the following year someone had the good idea of creating a document of API standards for at least the White House APIs to adhere to.

    The White House API Standards repository on GitHub was created on Dec 19, 2012, with a first commit whose message and content were some pretty impressive ASCII art:

    White House ASCII Art

    The committer was Bryan Hirsch, tech lead of New Media Technologies at the White House at the time. I found this sweet slide deck that he and Leigh Heyman, Director of New Media Technologies at the White House, used to explain the thinking behind their new standards, plus an article about and video of the talk.

    The tl;dr is that they created the API standards after working on the “We the People” petition website and related API, in order to make the underlying data concepts easier for non-developers to understand, and to maintain and encourage API best practices over time at the White House and maybe beyond. That “beyond” definitely happened.

    Before I get into that, I think it’s worth documenting here the influences on the White House API Standards:

    It’s also worth noting that the White House petition project was built in Drupal, and these standards include Drupal-specific resources and one of the main influences is a talk from a Drupal conference. The standards also still include JSONP examples, although that technology is outdated, insecure, and not generally recommended anymore.

    The offspring and third cousins twice removed of the White House API Standards

    White House API Standards GitHub Repo

    To trace the family tree that sprung from the White House API guidelines, I started with the GitHub repository’s forks. There are 654 forks, which means that the repo has been copied and potentially extended or used as a base for other projects 654 times. It has 2,621 stars, meaning that many unique users on GitHub wanted to register their interest in the project. 215 users are watching it, even though the last activity was 4 years ago.

    The following governments have made direct forks of the repo:

    Admittedly, a fork doesn’t automatically mean use, but in some of these cases the forking organization has clearly used and updated the standards.

    Other government agencies cite these guidelines as their origin, although I can’t find a direct GitHub fork or commit trail:

    • 18F: Source (created by forking the White House’s repo).
    • Australian Digital Transformation Office: Source
    • Finland: Source
    • El Salvador: Source

    The fact that 18F (and subsequently the General Services Administration), the Australian Digital Transformation Office, and the UK Government Digital Service all have API guidelines that either directly originate from the WH standards or were influenced by them is pretty significant, because all three of these organizations have been hugely influential in digital government and government API strategy and implementation.

    And the influence of this repo doesn’t stop with the public sector. Plenty of private-sector and nonprofit organizations have forked the repo or cited it as an example or influence, including Code for America brigades, Microsoft, and IBM Watson.

    Most of this was uncovered through browsing and text searching on GitHub as well as on DuckDuckGo. You could explore this more rigorously with some comparative textual analysis of the government API guidelines out there that may not reference the White House repo, but I’m not sure if it’s worth going that far. APIs have gotten more ubiquitous, and as more and more governments (and companies) have started implementing API programs, their API conventions have matured and evolved past the White House API standards to include things like design thinking for API product strategy and more detailed recommendations on other aspects of the API lifecycle, such as security and discovery.

    Despite that, it’s interesting to see how much impact these standards have had. As I continue with my research, I’ll probably be able to further trace lines back to 18F, UK Government Digital Service, and Australia, showing the impact that any single organization can have on this tightknit landscape.

  5. Public Data as Public History – and Future

    “It is a supreme gift to realize that the past is a burden you don’t need to carry with you.”1

    In our current digital world, this advice feels both relevant and out of reach. As tech companies follow your every click, view, like, and search across the web, they build profiles of you and assign you a shadow identity even if you “opt out” of tracking, and they effectively make it impossible for you to let the past go.2

    Not only is it unclear whether you can ever erase this past, but it’s also incredibly difficult to escape it — both within a single product and across the internet via advertisements. For example: A friend recently searched for “swallow” on eBay in order to find back issues of a food magazine of the same name, and after getting results that were pornographic rather than useful, she continued to see recommendations for sex-related products for days after. She spent hours scouring the internet and finally talking to eBay support for ways to delete her search history or change recommendations, all to find out that the only path forward was to delete her account and make a new one, a process which could take days or weeks. There are numerous product issues here, but to me one of the most shocking things is that even within a single product, users do not have the ability to control what’s saved and used about them.

    For another example, take this powerful article from last year in which the author describes how after she suffered a stillbirth, she continued seeing ads targeted at pregnant women. When she reported them as not relevant, she was then shown ads for products for newborns, as though the ad algorithm had assumed that because she was no longer pregnant, she must’ve given birth happily. Facebook responded to this article with instructions on how to opt out of entire ad topics, but that’s just for their platform. How can someone possibly reshape their preferences, history, and identity across the internet when their data is being consumed, analyzed, and used for targeted ads (or other purposes) without their knowledge or consent by companies they may or may not even know about?3

    Cities and the burden of their past

    A lot has been written about the loss of agency and data ownership of individuals on the internet, and there are projects and legislation underway seeking to address these issues. But what does this mean for communities? For cities?

    How does the current state of technology enable or prohibit cities and the people living in them from making their own history, re-making it, owning it, and disowning it?

    Note: I’m focusing on cities here rather than communities or other levels of government, because they are a nice little unit with formal governance and plenty of examples to draw on.

    Obviously, cities are a bit different from individuals. For one, cities are very much built on the past: they survive for centuries if not millennia, and they evolve and are constantly shaped by past decisions as well as the desires and needs of current inhabitants or stakeholders, whether they are locals or live in Silicon Valley. We see the past all around us: physical infrastructure like buildings, streets, and water systems, and cultural infrastructure, like public art, outdoor spaces, and memorials. We also see different versions of or remembrances of the past coexisting, like statues of Martin Luther King, Jr., sharing space with memorials to Confederate soldiers or white supremacists.

    And there are many pasts we don’t see: the villages, cities, trade routes, and culture of indigenous peoples erased or displaced by settlers. There are the voices and stories that this society has historically not heard or recorded: those of women, indigenous groups, minorities, lower classes, the disabled, and immigrant communities.

    This past of a city is not something we can or should easily discard, even if it’s a burden we don’t want to carry with us. It is important to recognize and to seek to understand, because those past decisions impact the present. The construction of highways through historically black or blue-collar neighborhoods not only displaced communities and ensured the future difficulty of revitalizing those neighborhoods, but also led people and money out of cities and into suburbs, shaping American poverty today.

    But a city’s past is constantly being reshaped: we reshape it when we uncover the untold stories, when we understand the influences shaping our present, when we make new decisions for our city’s present or future.

    The digital history of cities is public data

    With tech, we have the opportunity – or misfortune – of having another medium on and with which to write our cities’ and our communities’ histories.

    We’re writing the digital history of cities in the same way our personal histories are being written for us online: through data. For individuals, digital history is the personal data that accumulates from our digital activity - the data we intentionally input and collect as well as the data collected about us.

    For cities, that digital history is public data, by which I mean data that is generated by the public, though it may not necessarily be publicly accessible. Public data can take a few different forms – and if I’m missing any below, please let me know!

    Surveys and observational analysis

    For ages cities have been using public surveys to collect data to understand the stories of their communities and inform policies, zoning rules, etc. There are known issues with this, such as sample size, self-selection, truthfulness, and replicability.4 People have to opt in to taking the survey, so surveys are missing the voices of people who opt out, and even when taking surveys, people may not answer truthfully or consistently with what they’ve said in the past. Other tactics involve in-person observational analysis, but that’s only useful when not used in isolation, which I am told is unfortunately often the practice.

    Operational data

    Operational data is data that city agencies collect in the course of their daily operations. More governments are starting to understand the power of the data they generate simply by doing their jobs, and the stories they can tell with that data.

    For example, New York City established the Mayor’s Office of Data Analytics (MODA) to start treating its operational data as a true asset that can help the city improve services, address issues, share data across the city, and implement NYC’s open data law. They are starting to tell the stories of this data and the people involved in its creation, such as those of drivers of for-hire vehicles and their welfare.

    Open data

    Public data can be open data. It can be the data that’s available for citizens and companies and other organizations to download and browse or access with an API key. Not all public data that governments collect is actually – or should be – public in the sense of open and freely accessible. That same ride-hailing data that NYC has used to understand and inform policy was shown at one point to contain personally identifiable information which the public would surely not want to actually be public. The balance of privacy and transparency isn’t a problem that’s been solved, but that shouldn’t keep us from trying and promoting open when possible.

    While I have heard government tech folks lament the underutilization of open data portals, open data is critical in the effort for cities to own their narrative and be accountable to residents and themselves.

    Social data

    Social data can also be public data. An Australian start-up called Neighbourlytics has recognized this and is using social data to help cities understand their communities and inform city decisions. I saw their Head of Analytics, Gala Camacho Ferrari, give a talk at CSV,conf last month,5 and I’m equally cautious and excited about this.

    This is an example of people in the community creating their own data and cities being able to read and incorporate that into the city’s story and use it to have a voice in shaping the city. I have a few concerns though:

    1. People post on social media or review sites for a different purpose than city planning, and that context needs to be taken into account when trying to glean insights.
    2. Furthermore, those people may not consent to their data being used that way. They are posting publicly, though, so they have at least dubiously consented to public use (whether they understand that or not is a different question).
    3. Not everyone in the city engages with social media in a way that can be accessed and used, so their voices may not be represented.
    4. That data lives on notoriously closed platforms like Facebook, Twitter, and Instagram (owned by Facebook), so we don’t necessarily know what’s being filtered out or pushed to the top, and those algorithms might impact the way data is presented and read.

    Their founder does a good job addressing some of these concerns in a recent interview, and the last is deeply related to the issue of personal data ownership that we’ve already talked about above. Regardless, I think social data is a valuable piece of the puzzle because it rethinks how cities find and incorporate the voices of their residents.

    311 data

    Technically 311 data is a subset of operational data and in some cases open data, but it’s worth calling out specifically because it is so important. 311 is a service – a hotline or other communication mechanism – through which residents of a city can report issues or complaints about city services or neighbors, e.g. noise complaints, tenants’ rights issues, etc. More than 200 cities in the US have 311, though I’m not sure if cities in other countries have equivalent services.6

    I’ve heard NYC government employees call 311 open data the single most important dataset in the city. It is the feedback loop between the city and residents.

    Physical data

    Cities are physical places, and they generate physical data. There’s been a lot of hype in the past few years about “Smart Cities” and the potential of unlocking the value of this data for cities through smart or wifi-connected devices placed around the city. For example, Syracuse recently announced a $32 million project to upgrade its streetlights to have smart controllers, and these new lights will be the foundation for future projects like sensors to collect traffic data.

    At what point does this data collection for public good become privacy-violating surveillance? I don’t want to dive into that too much now, but what I find promising is the emergence of sensor companies like Numina that build devices that do “onboard computing” or “edge processing” – basically meaning that images and other identifying information are processed on the device and sent to the cloud in an anonymized form, and then that identifying information is deleted on the device. To me, this is the only acceptable and responsible way I know of to do public Internet of Things data collection.

    Geographic data

    Another type of public physical data is geographic data. Also known as map data or geospatial data, this type of data is public because it describes the world that we all share. This may not necessarily include geospatial data describing private property, but it does include data describing streets, parks, locations of public institutions, etc. Cities and governments typically have departments responsible for a geographic information system (GIS) with detailed geographic data of their jurisdiction, though that data has historically been difficult or costly for the public to access.

    Maps are an important part of the public data conversation because they are a “tool of both recognition and oppression.” I dive into this a bit more below, but for some positive news and a historical look at the social impact of maps, check out this article about a new mapping project called LandMark that aims to map and therefore help protect indigenous land.

    Shaping public data with an eye towards equity

    Data itself isn’t objective, and the act of collecting data isn’t enough. The stories we can tell from data are shaped by the way we collect the data, and what data we choose to collect. For example, if we collect data about medical service use and only collect binary gender options (male and female, rather than additional options such as trans-male and trans-female), then we are missing insight into the medical needs of the trans community.

    Taking the same approach with 311 data, if we don’t overlay service request data with socioeconomic, demographic, or location data, we may miss valuable insights into why certain requests seem more prevalent than others. Studies like this one have shown that socioeconomic and demographic factors do play a role in who is more likely to make service requests, meaning that we cannot use 311 data on its own to tell a definitive and unbiased story of all city service issues. From a practical perspective, this is important because the city uses this data to determine things like resource allocation and maintenance, and therefore needs to make sure additional data and analysis are used alongside the raw data to provide context.

    The hand that holds the pen

    As the characters of the recent film, Colette, like to say, the hand that holds the pen writes history. If public data is the history being written, we have to make sure that the public is the one holding the pen (and the paper). We already see the disturbing consequences of individuals not owning their data or rights to their data in the current tech landscape. This has sobering implications for cities and communities that we can’t ignore.

    We’ve already seen multiple instances of communities’ identities being shaped against their knowledge or will because of the power of tech companies like Google in owning and controlling the data that people use. Take for example the recent story about Google erasing a neighborhood and the aftereffects. A community in Buffalo that had referred to itself as the Fruit Belt for generations suddenly found itself being referred to as “Medical Park” on Google Maps. The source of the name change is complex (read the article - it’s a good one!), but a local geographer and data scientist named Aaron Krolikowski quoted in the article summarizes a key point:

    “We’ve historically tended to self-identify our communities…. If suddenly we become disconnected from that process, I think there’s a lot of questions that emerge about the ability of a community to determine its future, in some cases.”

    The ability of a for-profit company, which is not accountable to the community (except perhaps when there is bad enough press), to issue an entirely new identity to that community without its consent and with clear economic and social consequences for that community’s shape and future, is a demonstrative and alarming example of the wrong person holding the pen.

    We also have to be cognizant of who is making the pen. If the tools being used for civic planning and data collection are built by people who are not representative of the communities in which these tools are being deployed, they will not even be aware of the variations and types of data they need to be able to collect.

    This is why many people are reasonably wary of “Smart Cities” programs, especially Alphabet’s Sidewalk Labs project in Toronto. Alphabet is the parent company of Google, and this project involves huge quantities of data being collected. For this project and all the other tech projects involving public data generation, collection, and analysis, we have to keep asking:

    Who will truly own that data? Who will decide what types of data get collected, and who is collecting the data? Who is making the tools for this data collection? What policy decisions will this data influence, and what stories will be told from it? How will individuals’ privacy be protected? How will cities ensure this data doesn’t get passed to undisclosed companies to further target ads or seek profit or be used against the will of the public? Perhaps most importantly, how will city residents be able to control and shape that data, delete it when they choose to, and use it for their own self-determination?

    1 I saw this quote on a little bronze plaque in a Budapest coffee house called Cafe Madal, though, full disclosure, the wording might have been slightly different. I didn’t take a picture and I haven’t found this in the online archives of Sri Chinmoy, to whom the cafe attributed these words.
    2 You can read more about that here.
    3 For a look at data brokering, but in the location/geospatial industry, check out this report from MIT.
    4 Admittedly my source is a company trying to sell an alternative to surveys.
    5 Also, thanks to Gala for pointing me to some of the articles linked in this post!
    6 Check out this fun timeline for a history on 311.
  6. (Anti-)Trust is Digital Public Infrastructure

    “What does digital infrastructure mean to you?” someone asked me last week on a late night walk through DC.

    We’d just left federal government grounds, where a cross-organizational, tech-in-gov family games night was hosted in the ceremonial Secretary of War suite. I was buzzing from pizza and non-stop conversation about improving government for the American public.

    “APIs,” I said – which, you might already know, is my default answer to any tech question. API stands for Application Programming Interface, and it’s how you exchange data between software systems or servers.

    “I’m thinking at a lower level,” he responded. “To me, it’s NPM (a tool for managing JavaScript libraries), or other libraries we use to build software.”

    In other words, he meant code, open source or otherwise.

    Hardware and the cloud as digital infrastructure

    Code is digital infrastructure, and I’ve already written on why I think public infrastructure code should be open source. But there are other layers of digital infrastructure as well: the lowest level of all, technologically speaking, is hardware. APIs help you get data and value out of a system: they enable new workflows and products and unlock value for other parties. But to have APIs, you have to have software and data that can be exposed and used by others. That software lives as code, and that code has to live somewhere.

    Traditionally, in government and enterprise industries – from finance to healthcare – that “somewhere” was and often still is a locked-down warehouse, basement, or closet, housing one or many servers that can be accessed through secure networks on-site (e.g. an “intranet”) or, when allowed, by external users via the internet.

    Compare that to the “cloud”: The cloud is a bunch of servers that run somewhere else, in a dedicated server farm or data center, and if you want to host your code and data somewhere, you can purchase space in that data center. You no longer have to worry about the physical safety of your servers, like protecting them from natural disasters or making sure they don’t overheat. You also don’t have to worry about scaling: if you need the servers to do more or hold more data, or you need more users to be able to make requests to your servers, you don’t have to buy the new hardware (or physical space) and provision it yourself. You can simply click a button, pay a little more, and voila! You’ve got more server space and capability almost instantly.

    The question of whether governments should self-host software that is public infrastructure or host it in the cloud is complex. I see two main reasons why:

    1. Self-hosting is extraordinarily expensive, especially with the existing procurement process and government vendor landscape.

    2. The existential threat to democracy that monopolistic private hosting companies pose, especially the elephants in every room: Amazon and Google.

    Governments very often still “self-host,” but what this usually means is pissing away money to an endless number of contractors (including multiple layers of companies who simply re-sell the software or services of other companies) who manage and maybe sometimes own the data centers. It’s an expensive and inefficient byproduct of the bloated, spaghetti-like procurement process. I’m generally trying to be less uncouth (more couth?) but honestly this makes me so angry and the phrase “pissing away” feels right in my soul.

    Governments can save millions or billions of dollars by moving their code to be hosted in the cloud. This would also give better service to the People through more reliable, faster, and sometimes more secure websites that provide public services.

    But, and this is a big but: if hardware is a necessary component of digital public infrastructure, should that hardware be publicly (i.e. government) owned?

    I think the answer is maybe, but it has to be done differently than it is now. Procurement is part of digital infrastructure too, and the existing processes need to be improved if not overhauled completely.

    And if that hardware is not publicly owned, is it okay for government software to be hosted on just one, maybe two, cloud hosting providers?

    The answer to this question is emphatically no.

    This is a critical question to ask in this moment, because one cloud hosting provider is currently beating out all the others and is frequently cited as the best-in-class, de facto hosting platform: Amazon.1 Amazon Web Services (AWS) has over a 35% market share of the cloud,2 and there are only two significant competitors: Microsoft Azure and Google Cloud. An argument could even be made that the bigger a cloud provider is, the cheaper and more efficient its services are, which, some might argue, is better for everyone. Why have more than one big cloud, let alone three big clouds?

    Right now I’m generally for government services to be moved to the cloud, but it cannot be to a single cloud. If all government services were hosted on AWS, this would pose an incredible risk to the People: If Amazon failed, then government might fail.3 And even scarier, if Amazon could influence or turn off government by increasing costs or shutting down services, they could hold government, and therefore the People, hostage.4

    Government cannot rely on a single cloud that it does not own. We need clear guidelines and policy for diversifying the clouds that make up the hardware layer of digital public infrastructure.

    But it’s not just within the public realm that we have to be wary of clouds that are too big to fail or so big and closed that they can exert undue control without oversight. Our economy and society are increasingly run in digital or online spaces, and those spaces, while not physical, are public spaces. The digital infrastructure underlying them needs thoughtful oversight, regulation, and maintenance just as we would expect for roads, parks, and brick-and-mortar businesses. We need a plethora of digital options for hosting our businesses, accessing services, communicating with our social networks, and sharing photos from last week’s Corgi meetup in Central Park, and we need to be able to leave a platform if we don’t like what they’re doing with our data or the rules they impose on the types of software we can host.

    We need policy and anti-trust regulation to protect the People (read: the consumers, the citizens, the residents, the people who just want to get on with their day) from privately held, monopolistic cloud infrastructure.

    On a more technical note, this is why I’m also a proponent of Docker, containerization, and serverless technologies, which make it possible and, ideally, easy to move from one cloud provider to another. That way, even if you end up on AWS or Google Cloud, you can re-deploy your code to a different provider, or your own servers, in days or even hours if you need to. If these words don’t mean anything to you, just remember that portability of code and data is critical if we’re going to use cloud providers.

    I’m also super excited about distributed and decentralized technologies to help solve this problem, which I’ll write about later.

    Trust is digital infrastructure

    So far I’ve talked about how hardware, the cloud, procurement, and anti-trust regulation are key components of digital (public) infrastructure. But underlying all public infrastructure, digital or otherwise, is trust.

    We trust that restaurants are being reviewed by the Department of Health to make sure they’re sanitary and safe, and we trust that, barring some cases of discrimination and minor corruption, these reviews are honest and in the best interest of the public. We trust that the bridge we drive over to get to work is being maintained and audited for safety on a regular basis by dependable civil servants (or contractors being managed by civil servants), so that it won’t collapse while we’re on it. We trust – maybe – that when we enter our social security number into a government website, that number and the accompanying sensitive information about us are safe from hackers.

    It’s worrying to me that I have to insert the “maybe.” Government technology is so far behind private sector technology, from user, product, and tech perspectives, that it makes sense that people trust private companies more when it comes to technological sophistication and security. Tech companies got into people’s hands and onto people’s screens first. It makes sense to be a little cautious, or skeptical, but we should have that skepticism when we interact with private companies’ tech too.

    The key difference between private companies and government that somehow seems to be forgotten is that, in a democracy or republic at least, the People own the government and can influence and change how it’s run. When we don’t think gov tech is up to the task, we can vote for politicians and legislation to change that and we can meet with or become civil servants who tackle those problems. When we lose faith in Facebook or Google, we are powerless to change those companies, especially if/when there are no other options for us to turn to to conduct business or online social activity.

    It’s therefore also worrying to me when governments choose to trust private companies rather than build trust directly with citizens; for example, when a government website asks you to sign in using your Facebook login. While whoever made the decision to add that authentication feature probably had good intentions (such as attracting a younger demographic or making it easier for users by not adding to the account credentials they have to remember), this is a failure of gov tech because the government is abdicating the privilege and responsibility of trust. It outsources identity management – which is surely a key function of government, an indicator of authority, and a requirement for trust in any transaction – to a private-sector company. Not just any private-sector company, but Facebook, the company that is increasingly perceived (rightfully, in my opinion) as not only creepy, but unethical and certainly untrustworthy.

    When we build civic or gov tech, we cannot give up trust. We cannot build tools or companies that ask the People to trust those tools and companies over or instead of the government. As democratic institutions, we have to actively build trust, ask for it, and earn it. It’s the most critical piece of infrastructure, and we cannot lose it to private companies instead.

    1 For some examples of Amazon's cloud reach even four years ago, see this Atlantic article
    2 You can read more here about the research behind that number.
    3 And we’ve already seen the pain caused by political government shutdowns.
    4 One could argue that vendors currently hold the government hostage through the procurement system, but I’m not going to dive into that right now.

    Post header image "DC2" by Tim Dorr is licensed under CC BY-SA 2.0

  7. 5 Questions You Should Ask (and Answer) Before You Start Your Civic Tech Project

    I’m fortunate to be surrounded by people who want to do good in the world. “Civic tech” is – perhaps obviously – full of such people, but so is tech generally: many people building tech genuinely believe that their product helps improve people’s lives. And yes, the Todoist app does help me organize my to-dos more easily, and I have heard busy parents laud food delivery apps which take the major burden of meal management off of their plates.1

    Then there is tech explicitly geared toward “social good”: these are usually companies that have a mission to reduce inequality or to increase safety or security for a given community, such as access to food or housing. These are companies that believe they can be sustainable in – which is code for derive profit from – the pursuit of helping society, usually vulnerable or typically underserved segments of society.

    I’ve worked for such companies – back when phrases like “social entrepreneurship” were cool and even more recently – and participated in Code for America brigades filled with people who wanted to work at or start such companies. I’ve had to hold up the mirror and ask hard questions about myself and about what we were really doing:

    Can you truly be motivated by what’s good for a community while being motivated by profit? Will profit always win out? Do you know enough about the problem to help build viable solutions? Can you truly achieve societal change without changing the system itself?

    This last question is a huge topic that usually boils down to the debate between gradualism and revolution, which I don’t want to get into now. Check out Jessica McKenzie’s blog post for a great discussion on when civic tech can be bad, illustrated by different gradualist vs. revolutionary, for-profit and not-for-profit approaches to the US welfare system. Her concluding proposition is one that I think we should be using as a value measure for all civic or social good tech:

    Civic tech should strive to empower the powerless—not as a byproduct, but as a foundational premise. If it shifts power away from the powerful, so much the better.2

    So, how do we use this measure – how much did we empower the powerless and how much did we shift power from the powerful – when critiquing civic tech projects?3 How do we help people embarking on these projects – who are often from privileged backgrounds or do not have lived experience of the problems they want to tackle – use this as a guiding principle from the outset, before they ever lay hand to keyboard?

    There’s some great writing on this topic, and in my opinion we really need more of a revolutionary approach to most problems. However, it may be the gradualist in me that recognizes that right now, people who want to do social good in the world are starting their own projects and often their own companies, and many of them won’t know how to – or won’t want to – tackle real systemic change.

    The following are questions I’ve started to use to break this down for myself when I consider joining a civic tech endeavor, as well as for well-meaning people when we talk about their ideas to help others.

    I’ve even attempted my first flowchart ever:4

    [Flowchart: Is your civic tech project actually civic tech?]

    1. Is this a problem?

    Or is this a symptom of a bigger problem? Or neither? Is the problem that there is no way to apply for affordable housing online in your city, or is the problem that there is a 10-year waiting list for affordable housing for seniors, or that there simply aren’t enough affordable units? Or, that our approach to affordable housing needs more holistic reform to address systemic race and class oppression?

    2. Is this your problem?

    If you’re not part of the community you’re purportedly trying to help, stop and consider whether you’re suffering from the “white savior complex” (even if you’re not white). That’s not to say you can’t help, but if you’re in this situation, the most important thing you can do in your attempt to help is listen. The next most important thing to do is learn as much as possible about the status quo and how it got here, and keep an open mind.

    This question extends not only to you personally but to your founding team. Does anyone in this team have meaningful, lived experience of the problem? It’s critical that the people who will hopefully benefit from your solution have a voice in the solution (through user feedback or being on the product team), and ideally, that they actually have a seat at the table.

    3. Will you profit from this endeavor?

    This is primarily relevant if the answer to #2 is No. Profit isn’t necessarily incompatible with civic tech, but it is if you are trying to profit from an already vulnerable community and will not share those profits with that community. For real change and empowerment, the community being served by the solution – the community driving any profit for the solution’s owners – should be the one deriving that value and therefore that profit.5 When that’s not the case, it is literally exploitation.

    4. Is the community you’re trying to help powerless in the status quo?

    It’s very possible that you are part of the community you’re trying to help but that that community isn’t the one who needs help. For example, if you believe your problem is that the school board doesn’t know what parents want, and you want to build an app so that parents like you in your neighborhood can be more vocal to the school board, you should ask, who are these parents?6 Are they middle or upper class white folks? Do they already have outlets for voicing their opinions or exerting power and influence? If you believe this is an app for all parents, ask who would even be likely to use such an app and who might take up the most space on it. Are there parents in different neighborhoods or from different demographics who might be adversely affected by such a product? Would the parents in your neighborhood be vocal about policies that would hurt parents (and students) in another neighborhood, not out of malice but simply through lack of representation and space?

    Ultimately, this means that you have to know who all the stakeholders are. You can’t look at a problem in a vacuum. You have to seek to understand why the status quo exists and who currently benefits from it – because someone always benefits.

    5. Does your solution shift power to the powerless?

    Once you understand the stakeholders and the factors at play, you can start to ask whether your solution or project idea will actually change the status quo, and whether it will change the status quo to empower the powerless. If it doesn’t, go back to the drawing board and the community you’re in, learn more about social work and grassroots activism, and be humble enough to recognize when you may not have a good solution. This is the hardest thing for me and I’m guessing for most people: you believe so much in your idea – and you want to help so much – that it’s hard to acknowledge when it won’t have the impact you want it to.

    I’m not saying all this to be discouraging. We need more people caring about and thinking about these problems, and we need people with the energy, drive, and skills to help. But, we don’t need many new ideas.7 We don’t need people trying to solve problems on their own without deep thought and research about the problem and without hard consideration of their own biases. We don’t need tech people with buzz words, or people coming into cities telling civil servants that they need design thinking. We don’t need people riding in like knights in shiny user-centered armor.8

    So, I hope these questions are helpful for anyone thinking about how they can get involved in or start civic tech (or social good) projects. Listen, keep listening, and don’t profit from the vulnerable. Make it your goal to change the status quo to empower the powerless – whether in big ways or gradual ones – and keep measuring your impact against that as you go.

    1 Ha, ha! It's been a month since I've posted but I haven't lost my pun game!
    2 McKenzie, Jessica. https://civichall.org/civicist/good-tech-bad-tech/
    3 It’s hard to talk about this because I don’t want to sound discouraging. As Sara Watson writes, it’s hard to do tech criticism at all, much less civic tech criticism, because the critic is immediately branded as anti-technology, a luddite, or, to put it bluntly, an idiot. When you do civic tech criticism, you’re seen as unsupportive, even anti-public-good, and potentially anti-capitalist (which is a hard sell in the US) – and therefore naive. We need criticism, though, to improve the work that thousands of people across the country and many more across the world are doing.
    4 You can interchange "social good" with "civic tech" in this diagram and probably in this article as a whole. I'm trying to stay focused here though!
    5 In this context of "driving" and "deriving" profit, it's amazing how much of a difference one letter can make.
    6 Also, ask how they’re already hearing from parents, what factors come into play when they make decisions, etc. Going back to #1, this may not actually be a problem.
    7 Harrell, Cyd. https://medium.com/@cydharrell/civic-tech-as-a-tween-4cd780b971bb
    8 I don't have time to dive into this here, but check this out: Iskander, Natasha. https://hbr.org/2018/09/design-thinking-is-fundamentally-conservative-and-preserves-the-status-quo
  8. Measuring the Impact of Open Source Civic Tech, Part 1: The Hypothesis

    Since my last post, I’ve been obsessed with the idea of measuring impact. How do we know that doing any of this helps, and how do we make it more valuable? This topic has more facets than my neighborhood has feral cats, even if we’re scoping this to just civic tech. Given that open source software (OSS) is – and should be – such a major part of civic tech, I want to start there. How can we measure the health of the OSS component of civic tech projects and can that tell us anything of value about the impact of a given civic tech project or the overall movement?

    In this post, I’ll cover how people are currently thinking about civic tech impact, how other people are currently measuring OSS health and impact metrics, and how we might be able to approach looking at the intersection of those two things in the context of open source civic tech. This is just the first post of a series in which I do boatloads of research, data collection, probably some coding, and ultimately analysis on this intersection.

    My hypothesis driving this research: by applying OSS health metrics to civic tech projects published as OSS online, we will see that the most healthy and longest living projects are reusable infrastructure tools or components rather than community-specific projects, and that community-specific OSS projects have healthy metrics only when they’ve been adopted by a government or nonprofit entity.

    Measuring Civic Tech Impact

    There’s been lots of conversation over the past year about the success of the open data and civic tech movements – and lack thereof. The word “success” suggests that there were goals from the beginning that the movements are measured against, but I’m not entirely sure that’s true. There was vision, undoubtedly, but I haven’t found evidence yet that anyone set forth quantifiable measures of success 10 years ago that could be tracked through today.

    Therefore, let’s talk about “impact” instead of “success.” Impact can be had even when success is undefined. Even then, impact is hard to measure. David Eaves at the Harvard Kennedy School recently wrote some of his observations on often unrecognized wins of the open data movement, but still notes the difficulty in truly understanding all the impacts:

    Identifying and collecting [aggregate impacts] into something that is coherent and recognisable as public value is frustratingly difficult. Open data advocates are left with the Sisyphean task of chronicling disparate successes.1

    In civic tech as well, the conversation around impact tends to focus on stories and individual projects. To some extent this makes sense: the communities trying to use open data and civic tech are all different with diverse needs, and impact in one community may look different than in another. Before we can identify how to apply impact measurement methodology across all projects, we should first figure out how to quantifiably measure the impact of individual projects themselves.

    This is where it gets messy. Community groups and even larger, formal nonprofits in this space haven’t quite figured out how to measure outcomes. Grace O’Hara at Code for Australia recently wrote about the lack of and need for long-term impact research, and the importance of capturing measures like sustainability and inclusion in addition to “traditional measures of technological success: user numbers, reach, impressions and spread.” Likewise, Matt Stempeck has bemoaned 10 problems with impact measurement, including “We’re all using different metrics,” “Sharing is irregular,” “Most projects don’t reach most people”, and “We don’t evaluate relative to the macro environment.”2

    Take the annual Code for America Impact Report as an example. This report highlights the work of distinct projects and partnerships and uses metrics specific to those examples to show impact. Another example is this research article published by TransparenCEE, an organization that works towards government transparency and accountability using tech in Central and Eastern Europe: it too showcases specific examples, which the authors gathered from interviews with six civic tech organizations.

    These reports show the importance of measuring impact within a given problem space and community, and they also show that success is often measured in terms of the civic problem the project is trying to solve.

    What isn’t measured? Despite TransparenCEE’s finding that sustainability is an ongoing issue with civic tech success, I don’t see that being consistently measured or reported on. I also haven’t found measurement of the success or impact of the technology component of a given project, or of the project’s impact on other communities.

    In a separate article, TransparenCEE proposes that we look at impact not just within the community the project was built for, but also at its outward effect: “The main question we should all ask ourselves is how many communities did we manage to inspire to take action based on our project?”

    We should ask not only how many communities did we inspire, but also how many communities did we empower to take action based on our project?

    This to me is the real opportunity for the tech aspect of civic tech, and the reason we should look at the impact and health of the tech used in civic tech projects. Tech projects that provide infrastructure or tools that can be applied to other projects are incredibly important to civic tech, and their existence as open source software is necessary to their reusability and thus their impact.

    Measuring Open Source Software Health (and Impact)

    If civic tech is a tween (or an unruly teenager, as O’Hara posited), then open source is its 20-something older sibling who experimented a ton in college, graduated, and now, after a couple of fun start-up jobs, is looking to find the meaning of life – and stability. It suddenly cares about its health, wears a FitBit, even goes to the doctor once a year, and wants to become a lasting part of the world.

    In this analogy, the FitBit is the Community Health Analytics Open Source Software, known as CHAOSS. There are other tools and metrics, such as Netflix’s OSSTracker or PayPal’s Gander, but CHAOSS is the big one run by the Linux Foundation and includes both methodology and tooling. It also has working groups, pleasant diagrams, and, naturally, open source projects to help you run your own analysis and make sense of the findings.

    Big companies use and build OSS as major parts of their business, and they care about measuring the impact of this work. Facebook publishes a yearly open source report, and Google intermittently publishes one as well. Companies and non-profits alike are interested in understanding the impact that OSS has on their business (like efficiency, scalability, and bottom line, but also things like recruitment and marketing) as well as on the larger ecosystem. Check out the Linux Foundation’s detailed guide on approaches to measuring open source program success.

    Some of the metrics people collect are qualitative or from surveys, but many are from the OSS projects themselves as they exist on code hosting platforms like Github or Gitlab. A full list of such metrics that CHAOSS has identified lives here, but I’ve pulled out some of the ones I suspect will be interesting to observe while studying civic tech OSS:

    • Age of Community: Time since repository/organization was registered; or time since first release
    • All Licenses: List of licenses
    • Average Issue Resolution Time: The average amount of time it takes for issues to be closed.
    • Blogposts: Number of blogposts that mention the project.
    • Bus Factor: The number of developers/organizations the project would need to lose to destroy its progress.
    • Community Activity: Contribution frequency (contribution = commit, issue, comment, etc.).
    • Contributor Demographics: Gender, age, location, education, and skills.
    • Decision Distribution: Central vs. distributed decision making. Governance model, scalability of community.
    • Followers: Number of followers.
    • Forks: Number of forks.
    • Installs: Number of software installations of the project.
    • Open Issues New Contributors: The number of people opening an issue for the first time.
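
    To make these more concrete, here’s a minimal sketch of how a couple of the metrics above can be pulled straight from the Github REST API for a single repository. This is illustrative only – the owner/repo name below is a placeholder, it skips authentication and pagination, and CHAOSS tooling like GrimoireLab does this far more thoroughly:

    import datetime as dt
    import requests

    # Placeholder repository -- swap in any civic tech project you want to study.
    OWNER, REPO = "codeforamerica", "example-repo"
    API = "https://api.github.com"

    def parse(ts):
        # Github timestamps look like "2019-01-31T12:34:56Z"
        return dt.datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")

    # Age of Community: time since the repository was created
    repo = requests.get(f"{API}/repos/{OWNER}/{REPO}").json()
    age_days = (dt.datetime.utcnow() - parse(repo["created_at"])).days
    print(f"Repository age: {age_days} days")

    # Average Issue Resolution Time: mean time from opening to closing an issue
    # (first page of closed issues only; the issues endpoint also returns pull
    # requests, so those are filtered out)
    issues = requests.get(
        f"{API}/repos/{OWNER}/{REPO}/issues",
        params={"state": "closed", "per_page": 100},
    ).json()
    durations = [
        (parse(i["closed_at"]) - parse(i["created_at"])).days
        for i in issues
        if "pull_request" not in i and i.get("closed_at")
    ]
    if durations:
        print(f"Average issue resolution time: {sum(durations) / len(durations):.1f} days")

    Metrics like Contributor Demographics or Bus Factor need much more than an API call, which is part of why the CHAOSS methodology and tooling exist in the first place.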

    On with the Research

    Can these OSS health metrics be indicators of the impact of the tech part of civic tech? Can these indicators help us build more impactful, reusable, and scalable open source software? What governance or funding scenarios lead to “healthier” open source tech? Can “healthier” open source tech have positive impact on the outcomes of individual civic tech projects? Which metrics, if any, should we focus our efforts on to make sure our civic tech projects have impact in our communities and beyond?

    These are the questions I want to explore with my research. I’ll be using GrimoireLab to collect the data, and I’ll post the data in an accessible way when I have it. Please reach out if you have any data or feedback to share!

    1 Eaves, David, https://apolitical.co/solution_article/the-first-decade-of-open-data-has-been-a-win-but-not-for-the-reasons-you-think/
    2 You can find a rebuttal of his article here: https://civichall.org/civicist/10opportunities-for-impact-measurement-in-civic-tech/
  9. Public vs. Community Ownership in the Age of Open Source Civic Tech

    In my last post, I said that services and service delivery infrastructure which are necessary for human rights need to be publicly owned. In that same post, I gave an example of a nonprofit entity and a community-owned open standards project that have the opportunity to be publicly owned. I realized then that I wasn’t quite sure about the difference between public and community ownership, and whether one was better than the other.

    I’ve always played sports, and hey, I was raised in a capitalistic society, so the words I initially reached for were the competitive “better” and “versus,” but as with most things, the question isn’t about what’s better. Both are real and necessary parts of how our society works, and the question is about their relationship with each other. Furthermore, how is that relationship changing due to open source software and the civic tech movement?

    What’s the difference between publicly owned and community-owned?

    Publicly owned and community-owned are often used interchangeably. Community-owned and nonprofit are too, probably even more so. But publicly owned is not the same thing as community-owned, and community-owned is not a synonym of nonprofit or community-based. However, the differences aren’t cut-and-dried, and I think that trying to define and understand them is important for advancing public infrastructure, be it publicly or community owned.

    Publicly owned infrastructure

    Publicly owned infrastructure is infrastructure that is primarily funded by taxes or a government agency, and whose governance is owned by a government agency. Examples:

    • Streets
    • 311
    • Policies, law
    • Regulation of private industry
    • Open data and city developer portals like NYC’s developer portal

    If you’re in the US or another country with a functioning government,1 you’ve experienced publicly owned infrastructure. It’s roads and sanitation and public school buildings. When it’s not physical, it’s regulation, policy, people, and funding systems that uphold human rights, provide a framework for order and safety, and in many cases, make our lives as residents and as humans better.2 Sometimes private companies own and run infrastructure: utilities like energy and telecom are classic examples. In these cases public infrastructure still exists, largely in the form of regulation to ensure that the companies in question, which usually have a geographic monopoly, can’t be too greedy or too incompetent at the expense of residents’ rights.

    When it comes to publicly owned digital infrastructure, things resemble the Wild West. The groundwork for the internet was laid by government and by international government partnership, and since then the tech industry has exploded but public infrastructure has not kept up. People are starting to realize this, and now we’re seeing policy like GDPR in Europe and the internal transformation of government through digital services agencies that are trying to bring tech talent and expertise into the government tech development and procurement processes.

    We still have a long way to go, both in terms of policy and digital infrastructure (e.g. software used by government bodies, open government APIs, etc). In the meantime, and for over a decade, community-owned infrastructure initiatives have risen to fill this gap.

    Community-owned infrastructure

    Community-owned infrastructure is infrastructure that is not solely funded by taxes or a government agency, and whose governance is not owned by a government agency. Furthermore, community members who use or benefit from the infrastructure are involved in its governance. Examples:

    Honestly, it was difficult defining this and finding examples, not because there aren’t lots of great community initiatives, but because it’s hard to say which are truly community-owned.

    A critical part of community ownership is that the community actually owns the thing in question. This is not the case with many nonprofits or community-based organizations. From international NGOs like CERN to small nonprofit-run programs like 2-1-1, the third sector has been involved in infrastructure projects for years, sometimes decades, but structured organizations like these can be or seem exclusive. The community at large often has no real way to participate in the projects themselves, much less in the governance of those projects. The boards of directors of nonprofits are filled with the wealthy (and often passionate!), not with those with lived experience of the community the nonprofit seeks to serve.3 Nonprofits and the infrastructure they run, therefore, can still be valuable and good, but they are not community-owned.

    Still, the distinction can be fuzzy. Take NYC Mesh for example: this group is building a community owned internet network to free people from the expensive and privacy-disregarding telecom companies and to uphold what they see as the human right to communication. While they’re technically a project of the nonprofit Internet Society, I still consider the project to be community-owned because community members actually own the physical infrastructure that the mesh is built on, and because the governance of the project appears to be inclusive of that community.4

    Where community-owned meets publicly owned

    Now, you may be thinking that this whole “community-owned” idea, where the community members themselves govern the infrastructure, sounds a lot like government, particularly democratic government. You might say that “publicly-owned” means community ownership through government, and in democracies citizens have direct ownership in government through elections. You could even say government is us, with more formalized systems.

    Unfortunately, like with nonprofits, the government doesn’t seem to be us. It seems inaccessible to, detached from, and sometimes even at odds with our community.5 This is especially true in the US, where voter turnout during presidential election years never goes over 70% and during midterm years has yet to reach 50%.

    As a result, people have been looking for ways to take ownership in their communities, alongside, instead of, or in spite of government.

    Community ownership through civic tech

    The origins of the civic tech movement – at least as documented on the web – are somewhat murky: the earliest formal civic tech org according to Wikipedia was in Ukraine in 1991, but the movement really started to pick up steam in the 2000’s.6 In the US, a national nonprofit called Code for America launched in 2009, and their mission is to make “government work for the people, by the people, in the digital age.” Around the world, similar organizations have popped up, like Code for Australia, and many of them focus on improving government through citizen engagement in building infrastructure.

    Despite that focus on government, in my experience the local initiatives that followed often had very little or even nothing to do with government. Brigades – the name for local chapters of Code for America – have a good degree of autonomy and are locally run, and every community has a different relationship with its government.7 At Code for Denver, for example, we often partnered with nonprofit initiatives like Fresh Food Connect or the Rocky Mountain Microfinance Institute because the organizers felt this was one of the most effective approaches to helping the community and also engaging community members. Independent groups like Progressive HackNight have also emerged, and these groups as well as brigades also offer attendees the chance to pitch their own projects.

    I could see only two requirements any of these groups have for projects:

    1. Your project must be for the public good.
    2. Your project must be open source.

    Open source as community-owned infrastructure

    While the civic tech movement was taking off, so was the open source movement. Open source technology existed in the 1990’s and before,8 but providers of free hosting for open source code like SourceForge (launched 1999) and Github (launched 2008) paved the way for open source to be successful and widely adopted.

    Open source is infrastructure because it provides a methodology for code to be shared, collaborated on, and built on top of. Open source is community-owned because anyone can participate in a project by contributing code, comments, or questions. This is especially the case on a platform like Github, which has features for conversation about code, including reporting issues.

    Governance for open source projects is a huge topic that I want to dive more into later, but because anyone can see and contribute to code and voice their opinions on decisions about code development, governance is at least transparent and typically has avenues for community members to participate. If you don’t like the way an open source project is being governed – or it’s a dead project that no longer has a group of maintainers approving contributions – then you can simply copy the code and start your own project.

    There are problems with open source, such as inclusivity in participation and in code itself. Frankly, it needs to be more inclusive to be truly community-owned in practice rather than just theory. Regardless of these issues, open source software is a key manifestation of community-owned infrastructure that powers so much of technology, and by extension, our society.

    Transforming publicly owned into community-owned

    I don’t think it’s purely coincidence that Code for America started just a year after Github launched its platform, which enabled not only open source code hosting but also better collaboration on and engagement with open source projects.9 The first Code for America Github repository was created in October of 2010, and now the organization has 682 of them, with many more existing under brigades’ Github organizations. I’m working on a deeper analysis of Github use and open source sustainability models in civic tech, but even before that’s finished, I doubt the civic tech movement could have taken off the way it did without a tool like Github, and I’m confident it couldn’t have worked at all without open source as its bedrock.

    The greatest impact of these open source civic tech projects isn’t the projects themselves. Those often don’t actually last very long: of the 682 open source Code for America repos on Github, 450 haven’t been updated in over 2 years, and 576 haven’t had code pushed to them in over 2 years. I’ll dive more into this later, but the point is that these projects in the form of Github repos maintained by volunteer groups aren’t what’s going to change the world. It’s the practice of making and collaborating on these projects, the education of individuals about their community and of government about open source and modern best technology practices, and the increased engagement of all parties with each other that will change the world.

    To put it frankly, it’s the doing that matters.
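
    As an aside, for anyone who wants to check or update the repo counts above, here’s a rough sketch of how staleness numbers like these can be pulled from the Github API. It’s deliberately simplified (no authentication or error handling), the counts will drift as repos change, and note that Github tracks “updated” and “pushed” as separate timestamps, which is why the two numbers differ:

    import datetime as dt
    import requests

    # Count stale public repos in a Github organization. "updated_at" changes on
    # broader repository activity, while "pushed_at" only changes when code is
    # pushed -- hence two different staleness counts.
    ORG = "codeforamerica"
    CUTOFF = dt.datetime.utcnow() - dt.timedelta(days=2 * 365)

    def parse(ts):
        return dt.datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")

    repos, page = [], 1
    while True:
        batch = requests.get(
            f"https://api.github.com/orgs/{ORG}/repos",
            params={"per_page": 100, "page": page, "type": "public"},
        ).json()
        if not batch:
            break
        repos.extend(batch)
        page += 1

    not_updated = sum(1 for r in repos if parse(r["updated_at"]) < CUTOFF)
    not_pushed = sum(1 for r in repos if r["pushed_at"] and parse(r["pushed_at"]) < CUTOFF)
    print(f"{len(repos)} public repos; {not_updated} not updated and {not_pushed} not pushed in 2+ years")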

    We’re already seeing government change in remarkable ways to become more participatory.10 Take Washington, D.C., which publishes all of its laws on Github. That made it possible for GovTrack.us founder Joshua Tauberer to change a law in classic open source style: by submitting a contribution in the form of a Github pull request.

    For an example of more radical transformation, take vTaiwan. The “v” stands for virtual, and the goal of this new system for government is to increase the public’s participation in policy through technology and practices largely modeled on open source collaboration. Through vTaiwan, citizens engage in policy and legislation discussion from the comfort of their homes in a structured and surprisingly unchaotic way, scholars and public officials respond transparently, meetings about the policy are broadcast online, and outcomes have to be tied to the public discourse. Check out this post from Liz Barry describing the process and evolution of vTaiwan in more detail.

    There are so many more examples of progress being made in open and participatory government, with so much due to both the open source and civic tech movements, especially those two working in tandem. Open source software created an infrastructure model for civic tech and by extension government tech that is making publicly owned infrastructure more collaborative, transparent, and truly community-owned.

    1 The jokes are just too easy here – I’m going to resist.
    2 Italy's first Digital Commissioner recently said that governments are here to make our lives better, but IMO they’re not: governments are here to uphold rights and anything else is a bonus.
    3 Curious what boards do? Check out this handy doc.
    4 It's hard to say for sure though about their governance -- everyone must agree to the Network Commons License and there are meetups for people to come and discuss, but otherwise there’s no information on the formal governance structure.
    6 I think this history warrants more investigation and analysis, but I’ve already gotten too bogged down in research this week.
    7 I’ve posted previously about how in my early experiences of civic tech, the tech part was less important than the civic education I received. Wherever I joined a civic hacking group or a local brigade in the Code for America network, I learned about how that community functioned before I could learn how I could help it function better.
    8 Linux, the poster child of open source, was released in 1991. The term "open source software" wasn't coined until 1998.
    9 I'm not saying they planned it or there was necessarily direct causality, more that it was all part of the same Zeitgeist.
    10 "Participatory government" (or variations thereof) is a major buzzword in civic circles these days.
  10. What do human rights, open standards, and venture capital have to do with public infrastructure?

    What so much of the conversation around civic tech boils down to is the question of public/private partnerships. What is the role of companies, specifically tech companies, in our communities, and what is the role of government? And, assuming we will always have both,1 how should they work together for public good?

    I’m not going to wax lyrical on all the many economic, political, and moral facets of this question, but I did recently spend three years in a position that put me face to face with this question on a daily basis. This is some of what I’ve learned.

    The tl;dr: Services that are necessary to protect and enable human rights, and the infrastructure to deliver those services, should be publicly owned.

    The Three Sectors

    Many of you reading this probably work in the private sector. “Private sector” is a fancy term that basically means for-profit companies. Just to make sure we’re all using the same lingo, there are three sectors:

    1. Public: These are organizations or institutions owned by the public. This sector often goes by the colloquial term “government.” I’m putting this one first because it’s the most important.
    2. Private: These are owned by private individuals or fang-toothed venture capital funds.2 While in the US people often use “private sector” to include privately held non-profits, I think it’s clearer to think of private as for-profit, and that’s how I’m going to use it in this post. If a for-profit company is publicly traded, technically members of the public can own it, but you must have the qualification of money, not humanity, to do so.
    3. Third: While I see this term mostly used outside of the US, I think it’s a good way of describing nonprofits or non-governmental organizations (NGOs).3 They’re the ones always taken for granted: they are supposed to be motivated by mission, not moolah,4 and I’ve heard self-described “libertarians” cite them as the people who will pick up the pieces of society out of philanthropic kindness in place of government. Indeed, in many places, they often already do this.

    When I was young and bright-eyed and just getting into tech, I didn’t know much about the Three Sectors.5 After working at nonprofits and getting more involved in civic tech, I learned a lot more about them, especially the relationship between the public and third sectors. It wasn’t until the past few years that I experienced first-hand how the private sector does, can, and should play into this.

    The productization of social services delivery

    This case study is about social services with a focus on Healthify because I recently spent three intense, often very fulfilling years in that space with that company.6 However, in this section heading you could easily replace “social services” with any other public service or function of government, and you’ll probably be able to find examples of this happening in that area.

    Healthify is a for-profit software and services company whose mission is to “build a world where no one’s health is hindered by their need.” They want to do this by building community health infrastructure (systems, technology, relationships) to connect underserved populations with the social services they need to thrive and ultimately improve health outcomes. Tangibly, their long-term goal is to flip the ratio of spending on healthcare vs social services in the US based on percentage of GDP.

    [Chart: OECD government spending on healthcare vs. social services]

    As you can see, the US spends proportionally much more on healthcare than on social services, unlike comparable countries. Healthify believes that doing the opposite will reduce spending overall and produce better outcomes for people. They’re out to prove that and to make it happen.

    There’s a lot that goes into this – including need identification and referral coordination software and client services that help health systems build networks with community-based organizations – but it all started with data. Data about social services.

    The social services data

    Healthify’s product started as a search database for social services. Most (if not all) of the other vendors out there have something similar. There are three notable things about this data:

    This is local data. A single large national call center isn’t very useful in collecting and maintaining this data, because the people curating the data need to be well-versed in local issues. The housing issue and housing-related services in San Francisco, for example, are way different than their counterparts in Ann Arbor – and the data reflects that.

    This is public data. You’re probably paying for the maintenance of this data in some way via tax-funded grants to 2-1-1s (more on them below) or nonprofits, and even if you’re not, you’re certainly paying for the actual upkeep of some of these services, and those services are the creators of the data in the first place.

    This data is necessary to uphold human rights. The Universal Declaration of Human Rights decrees that

    • “Everyone has the right of equal access to public service in [their] country” and
    • “Everyone, as a member of society, has the right to social security and is entitled to realization…of the economic, social and cultural rights indispensable for [their] dignity and the free development of [their] personality.”7

    I think we can all agree that for someone to be able to access public service and resources for their social security and realization of rights, they need to have basic information about those services and resources. This data is that very information.

    But it’s not just about the data itself. It’s about making the data discoverable and accessible, by which I mean understandable and usable by all people. To do that, we need more than CSVs on a thumb drive or a call center that verbally gives this information out on the phone. We need infrastructure.

    The social services infrastructure landscape

    This may sound somewhat familiar to you. If so, you may be thinking about 2-1-1, which is a nationally-reserved hotline for people seeking human and social services assistance. 2-1-1s may have a national brand, but they are all locally or regionally managed, with over 70% run or funded by UnitedWay.

    Being decentralized and run by nonprofits, 2-1-1 is usually at least indirectly funded by taxpayers depending on local circumstances. They’re typically underfunded, and the quality of their services and data available to the public varies dramatically.

    Six years ago, Healthify founders working in community clinics felt that neither 2-1-1 resources nor the physical binders being manually maintained by fellow community health workers were good enough, so they set out to create their own database that could do it better. This story is pretty similar to how other vendors, such as UniteUS, got into this space: through personal experience with outdated, nonexistent, or poorly performing public infrastructure.

    Private (or third) sector innovation can start as a response to inadequate public infrastructure, and that’s okay.

    Today, the landscape for human and social services data consists of these major players:

    • 2-1-1s
    • Other community nonprofits addressing social service delivery
    • Vendors
    • Google (or other search engines)
    • Build-Your-Own by health systems seeking to address the social determinants of health

    It’s pretty competitive, which in some ways is a good thing for social workers and their clients. The competition pushes actors to have better data and build useful, usable software on top of it. However, because there’s no real shared infrastructure, they’re all doing redundant work. The amount of human and machine data verification and improvement that goes into maintaining a good community resource database is immense, and every actor here is doing it in a silo.

    Furthermore, because this data is necessary to uphold human rights, the infrastructure supporting its delivery is also necessary to uphold human rights. This means that we can’t just rely on private actors, and ideally not on third sector actors either. Private actors shouldn’t be able to decide who gets access to this data and how. The people who produce or rely on the data – in other words, all of us – should own the data and its infrastructure; ergo, there needs to be a publicly owned actor.

    Possible versions of the world

    All of this can play out in different scenarios. I’ll illustrate three of them:

    The world we want:

    [Diagram: The world we want]

    We should have a world where there is robust publicly owned infrastructure that community members and vendors alike can use, participate in, and benefit from. I don’t think the private sector should have free rein to use public services for profit; there are business and partnership models that are economical for businesses and that ensure public services are paid for their business value.

    Note: In this diagram, I put a nice icon of a Greek-inspired building – what I’ve been told is the universally recognized symbol for government – next to 2-1-1 to illustrate that, while it isn’t currently, 2-1-1 (or infrastructure like it) should be publicly owned, and publicly owned usually means integrated into government.

    The world we don’t want:

    [Diagram: The world we don’t want]

    We shouldn’t have a world without publicly owned infrastructure. Without publicly owned infrastructure like this, for-profit companies take on the ethical burden of upholding human rights – and come on, we know they’re not very good at that – and nonprofits have to pick up that mantle without viable motive or means to do so well.

    The world we should all be afraid of:

    [Diagram: The world we should all be afraid of]

    The world we should all be afraid of is one where Google/Alphabet or Amazon or another massive company replaces public infrastructure. For-profit companies are motivated by profit, not public good, and are certainly not motivated to serve all of a community’s residents but rather only the ones with dollars, usually at the expense of those without dollars. Furthermore, when a single for-profit company holds a monopoly on infrastructure, they are more likely to hold a monopoly (or at least a choke-collar) on innovation that uses that infrastructure, unless there is policy enforced to prevent this.8

    Getting to the world we want

    It starts by agreeing on what services and service infrastructure is necessary to uphold human rights. That itself starts with us agreeing on what human rights are, but luckily in the US we have this thing called the Bill of Rights, and in the world we have this thing called the Universal Declaration of Human Rights. We agree on human rights, so let’s focus more energy on figuring out how to uphold them.

    Once we’ve done that and identified what services are necessary for human rights, we need to build public infrastructure that the public owns.

    Public infrastructure tech

    On the technical side, our public infrastructure needs to use open standards for data exchange, so that it’s easier to build tools that use public services and their underlying data. This increases access and innovation because it enables any actor – public, private, or third-sector – to participate and get value out of the data. We also need to empower public service agencies to be digitally literate and to maintain good quality of service with their infrastructure.

    In the social services landscape, Open Referral has been spearheading infrastructure innovation for years, and is increasingly gaining traction. They organize a working group and maintain the open Human Services Data Specification and related API spec.
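
    To give a flavor of what that looks like in practice, here’s a simplified, illustrative record loosely modeled on an HSDS-style service entry. The field names are based on my reading of the spec and the values are made up – consult Open Referral’s actual documentation before building against it:

    import json

    # Illustrative only: a simplified record in the spirit of the Human Services
    # Data Specification's "service" table. All values are fictional.
    service = {
        "id": "ac148810-d857-441c-9679-408f346de14b",
        "organization_id": "example-agency-123",
        "name": "Emergency Housing Assistance",
        "description": "Short-term rental assistance for residents facing eviction.",
        "status": "active",
        "url": "https://example.org/housing-help",
        "email": "intake@example.org",
    }

    # Because the structure is shared, a 2-1-1, a vendor, or a city tool can all
    # exchange records like this without writing a custom importer for every source.
    print(json.dumps(service, indent=2))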

    Open Referral’s innovation isn’t just technical, but also about people and business. They help 2-1-1s and public entities understand and test out business models and the tech that can support them – which leads me to my next point.

    Public infrastructure sustainability

    A key part of all this is making public infrastructure not only viable but sustainable. To do this, we need to change our approach to public/private partnerships with a focus on building accessible infrastructure, and we need to help the public (and third) sector develop business models of their own so that providing those services to commercial entities is sustainable.

    Public infrastructure policy

    We also need policy to prevent the actors in the private sector from becoming integral yet still profit-driven and privately held pieces of that public infrastructure.9 I’m not saying we need to remove the private sector or profit motives from the equation, but we have to empower the public sector to innovate, to build or buy infrastructure thoughtfully and ethically, and to create partnerships with the private sector that are advantageous for the public, not just the private.

    1 Unless the shutdown continues much longer.
    2 Jokes! Some of y’all have molars!
    3 The difference is explained here.
    4 I’m not going to argue this point here, although I recognize that at the end of the day these orgs are always thinking about funding
    5 Geez, I feel like I’m talking about the four horsemen of the apocalypse. The fourth horseman in this case is the B-corp. WTF even is that.
    6 To be clear, I really like Healthify and think they're doing awesome work.
    7 I replaced "his" with the gender neutral "their."
    8 Net Neutrality is a great recent example of this debate. I recommend this article on it from IEEE.
    9 The phrase "too big to fail" comes to mind here.