This is an edited version of an episode on The Water Data Podcast produced by Dalberg Advisors and Ashoka Trust for Research in Ecology and the Environment. The hosts, Veena Srinivasan and Nirat Bhatnagar, speak with Peter Gleick from Pacific Institute and Rohini Nilekani about the role and importance of water data, trends in the sector, and how to collaborate to address gaps in the water data ecosystem.
After I had started my vehicle for philanthropy, I began thinking about how to do something more strategic with the endowment, when it suddenly came to me that water is the most critical resource in India, one that affects people at large, as well as the economy and so many other things in its wake. This is where my journey began, and in 2005 I founded Arghyam, a foundation focusing on water. I am glad we did because at that time, we were the only Indian foundation exclusively focused on water. The importance of data became clear pretty quickly. When we set up the open public resource called the India Water Portal, a knowledge platform where people could contribute, share, and discover knowledge about water, data had a lot to do with it. One of the first things I remember that we imported onto the India Water Portal was 100 years of meteorological data on the Indian monsoon. We tried to present it in a more readable format, we experimented with some of that, and it has become one of the most important tools for researchers to use even today. So that was our first brush with data. I am not a scientist, so I come to data from a citizen’s point of view. I like to look at data, as does the team at Arghyam, in terms of how we can use data, especially data as a public good, to serve citizens and society better.
Agreeing, Gleick points out that data can be useful for science but if it is not useful for the public good as well then it is less valuable. He adds that there are many kinds of data – demographic data on population and changes in where people live; hydrologic data on water; climatic data; information on how water is used or the quality of water; or economics i.e. the price and value of water. All of these are critically important in order to address water because it is such a big issue. He gives the example of the debate on whether we can do without building dams – to address this question he started collecting data on the environmental impacts of dams, the water use, the size and power that they generated, the impacts on fisheries, etc. in order to analyze whether big dams were better or worse than a large number of small dams. It turns out that was not the right question, he says. The important question had to do with the impacts on communities and fisheries, and the way that dams were operated from a social and institutional perspective. It was not just the data, but the information that they were trying to get out of the data that was important.
Putting Data in People’s Hands
We know it is very important for people like us, who are practitioners, to understand the big data numbers, right? How much rain falls, how much water availability there is, and how much of that is usable, those kinds of things are important to understand. How many rivers flow in India – big numbers like that are very important to understand, but they do not particularly help ordinary citizens. But within a city, you can find such a huge difference between how much water rich people like us use every day, which goes up to 400 liters a day per person, and a person living in a slum, not very far from my house, who may not even get 30 liters per day. Those are the kind of numbers I am interested in understanding, and why that is so, and what can be done about it.
I can give a lot of examples of how, in our work over the last 15-16 years, we have tried to unpack some of the public government data sets to arrive at civil society pathways of action using that data. For example, we know that in the 60s and 70s, there was hardly any use of groundwater; it was only about 1%. Now we are using more groundwater in India than the US and China combined. And we have some 30-40 million bore wells, nobody knows the exact number, extracting groundwater. Knowing that this is such a crisis was useful, but we were also able to create data through participatory groundwater programs in which local communities, with some of our hydrogeologist partners like ACWADAM, were able to collect data on their own aquifers and then develop social protocols for using that water, which was then understood to be limited and finite.
So, there are some really fascinating examples of better, more sustainable, and more equitable use of water that came about simply by putting data in the hands of people and giving them the agency to understand how to collect it and how to monitor the water resource using simple data gathering tools. I will give another quick example. We worked with the Karnataka state government on a program very early in the Arghyam days, called ‘Suvarna Jala’, where they wanted to put rainwater harvesting systems in all the schools of Karnataka. It seemed like a good idea, but when we started collecting real data on the ground, we found that many of the schools did not need it or already had rainwater harvesting. Secondly, the teachers had no clue what the thing was: it had arrived during the summer holidays when they were not there, and it was not possible for them to use it properly without adequate training. So, using that data, we went back to the government and they stopped the second part of the program so they could redesign the whole thing, saving the public exchequer a lot of money in the bargain. So, we try to use data in that manner.
There has definitely been an evolution in water data, notes Gleick. For a long time, we collected a very narrow set of water data – information on rainfall, run-off data from rivers, and water quality data that only included components that engineers needed in order to build big dams and figure out how to take more water out of the system for human use. However, in the last few decades, or really in the last few years, he says that there has been an explosion of interest in water, or there has been a growing realization of the water crisis and its many different dimensions. There has been an increase in awareness and activities, not just by the scientific community but by local communities trying to figure out how to understand and address their water problems.
This has pushed the demand side for water data and it has resulted in positive changes, says Gleick, including collecting data on how much water we need to do certain kinds of things, data on the ecological impacts of water use, and data on the economics of what water is costing people, which relates to the human right to water and whether we should be charging money for water at all. These changes have also highlighted some of the big gaps in the data that we have not been collecting, he says. There has also been an evolution in the way we collect data. For example, satellites have collected new forms of data that have brought to light the nature of the groundwater crisis. Gleick mentions the GRACE satellites, which measure and provide detailed information on the severity of the groundwater overdraft problem, and which have spurred conversations around what we can do to protect and restore groundwater.
The Challenges of Collecting and Sharing Data
When it comes to the Bazaar or markets, the demand for understanding water and data on water must have skyrocketed because now it has become a key constraint in the supply chain, from manufacturing to the software industry. When you do not have water, you cannot produce anything, even if it does not appear to be related to water. So I know that the market demand for water data has shot up in the last few decades in India, and that is where some of the differences begin to appear: today there are corporations that can acquire water data for their use and we do not even know how they use it. Private satellites can give markets access to data which the common citizen cannot get and the state will not share, because data is an extremely political subject, especially in India where water is both a state and a union subject under the Constitution. How many times have there almost been state battles over the sharing of data and the sharing of the water itself? So, water sharing is such an important political subject and the data in the public domain in India is very contested.
We all know that no state is waiting to put out absolutely true data on its rivers because these rivers are going to flow into other states, and all of the sharing then has to be explained to political constituents. In India’s case, we have even seen terrible riots over any political perception that one state is unfairly giving its water to another state. We all know how data is packaged and presented when there is this notion of zero sum games, of a resource that cannot be renewed when it needs to be. So that is a very important thing when it comes to the supply side of data, which is why it is crucial to have civil society institutions or citizens try their best to ground truth the public data, so that they can at least make local decisions with more accurate information. Often the state’s supply of data may not be at the granularity that citizens need in order to act on their water problems.
For example, when it comes to water quality, fluoride or arsenic are two of the biggest contaminants in our groundwater here in India. We do not have the exact numbers but millions of people are affected, due to fluoride or arsenic, with serious health concerns. But the point is, if I am sitting in a taluka or a small zone somewhere in a generally arsenic-affected area, it does not automatically mean that all my water sources necessarily have arsenic. Sometimes, like we found out in our work in Balapur Puri, Orissa, though there was a lot of contamination, it was not reported in the government data sets. So then our partners had to work with the parliamentary representative and actually get Balapur represented in the MIS system so that they could get the funds to tackle the arsenic. So, we would love to see a world where there is data coming from the top, but there is also ground truthing and data contribution from below. The goal is for data to not always be flowing from one direction – top to bottom or bottom to top but, like water, have it flow in multiple directions. For us, it is really important to see how data can be used as empowerment of Samaaj, of society, because markets can easily empower themselves with data. The state, of course, sometimes has a monopoly over data, but what about citizens? If citizens have to use data so that they can develop more agency for themselves and their institutions, how do we restructure the idea of data and how do we, in this 21st century, flip the idea of data as something that either the state or the market uses to something that citizens can empower themselves with? And so we came up with a broad structure, which is a part of what we call Societal Platform Thinking, in which we allow for the principle of how to distribute the ability to solve instead of simply trying to solve a problem.
Gleick also notes that there is a difference between raw data in the form of numbers about a physical or geochemical factor and data that is ultimately useful for informing public policy and community decisions. For example, data on carbon dioxide tells us how much carbon is in the atmosphere, but it does not tell us where that carbon comes from, what the consequences of that carbon will be for climate change, or what the consequences of climate change will be for society, which is what we really care about. Similarly he says that water data tells us a physical or geochemical fact, but the things we really care about are the implications for human health, for example. Politics is a very important piece of this. For a long time, and even today, some water data was collected by governments and kept secret. They were considered national security issues and they were not shared because of concerns about what political neighbors might use the data for.
The public was rarely involved in collecting data, but now Gleick thinks that we are slowly seeing a change in that. The internet has certainly facilitated our ability to share information, and satellites have been launched that produce data much more accessible to the public, data that governments were previously able to hold and keep secret. So, the idea that data ought to be open source, which is something he believes in very strongly, is a really important one. And any tool that we can develop that promotes the sharing of data is important for making that data more useful. The other issue is granularity. It is important to know that hundreds of millions of people are exposed to concentrations of arsenic in certain parts of Asia that are unhealthy. This is really important from a policy point of view and for developing strategies for dealing with arsenic. But on a personal level, people want to know whether their water has arsenic in it and that requires a different granularity of data. It requires people to be able to test their water or for someone local to be able to test the water, share that information with communities, and then provide the resources to help them deal with that problem.
On the issue of the legitimacy of data, Gleick believes that it is important that data be trusted and verified and that governments should not be the sole arbiters of that. Governments should not be the only collectors of data, there should also be independent verification. That is a question for different legal systems – whose data is legitimate and what data is considered legitimate. To Gleick, we are entering an era where more individuals and citizens are able to collect and share data. When this data conflicts with official government data, that should raise alarm bells, and there has to be a way for there to be independent verification of data. Governments have failed to collect data that is important, or they have collected this important data and kept it a secret from citizens. The Internet is helping that, he says, and we are seeing more collection of data and widespread tools like our cell phones that are becoming instruments for collecting data. So there is a lot to be learned in this area.
For the government especially, the task of collecting more data is becoming more vast. But we must also point out that in some cases it works well. For example, after the tsunami in India, the government started really focusing on getting the right data, the right predictions, and the right modeling to tell us when the next threat is going to come. And we have seen that for every subsequent extreme weather event, there has been a very decent early warning system in place, which has actually trickled down to the local disaster management authorities. So we have been able to save thousands, if not hundreds of thousands, of lives because of the good modeling, the predictions, and the data put out in public at the right time and for the right people who need to make quick decisions, especially when it comes to floods and cyclones. So, in that sense, there has been a huge improvement. I think there has been a lot more data put out in the public domain in the last few years in India. Some of my people in the field have been worried that perhaps there is a small trend in the opposite direction, but if we could envision water data at the community level as an open public good, for people to discover, share, act on, and report back on what happens with their use of data, I think we could avoid many water conflicts.
Gleick agrees, giving the example of the water crisis in Flint, Michigan, where community and university groups started testing the water and found concentrations of lead and other contaminants, which led to big changes in policy and in the way the water utility was managed. That data did not come from the government; it came from individuals and nonprofit organizations. Another example is how inexpensive air quality monitors are now available in the United States, so when there were severe wildfires in California over the last few years, people were able to use air quality monitors to know when to use a mask or avoid going outside. These crowd-sourced data sets were publicly available, he says, and we are beginning to see more and more of these kinds of examples. We really need inexpensive devices like this, perhaps something similar for water where you could just test your own water. I know a lot of people are working on this. And then you should be able to put that data out and get these massive pictures of the quality of water around any community or state or nation. We are still waiting for the technologists and scientists to give us something very simple, with which we can measure not just bacteria but at least 8-10 indicators of water quality.
In Bangalore, for example, our lakes are a source of pride for us. They used to be a source of sustainable irrigation water in earlier days, but now Bangalore is a megalopolis and we mostly use our lakes to walk around and throw our sewage in. But citizens have gotten very excited about lakes in the last decade, and they themselves are going around collecting data and often coming up against the civic bodies, saying “You said there is no sewage, but excuse me, here is proof there is sewage coming into my lake.” When the quality of demand rises in the public, there is no system that can withstand that pressure and it will have to yield. So, if we keep building the quality of demand for the data – for real, verifiable, maybe triangulated ground truth data on water – there is no system that cannot start to yield and either share or figure out solutions together with the community. So that is kind of a theory of change that we have to help prove out in the coming days, because there are so many questions, right? One is the quality of the data, the other is data standards. I mean, there seems to be so much confusion about how to collect data, and what the standards are by which we measure something like quality, quantity, accessibility, etc. And then there is the interoperability of that data, because you have your data set and I have mine, and if the twain never meet, then we cannot do the big picture analysis at all. And we are stuck between this lack of trusted quality of data, lack of common data standards, and lack of interoperability of data, right? So we need to work more and more towards getting that done, and in some scenarios at least, ask whether water data can be an exhaust of common activity, rather than something we constantly have to spend human and financial resources to collect.
The Responsibility of the Samaaj, Sarkaar, and Bazaar
In terms of the role that the three sectors should play, Gleick believes that each has a different responsibility. It is the responsibility of governments to spend the money to build, for example, extensive remote sensing systems and satellite systems, to collect large-scale data that the public cannot collect. It is also the role of governments in general, to collect this data and make it available to the public, to the scientific community, to the academic community, and to the public service community to use the data the way they think is important. On the other hand, he thinks it is the role of communities to increasingly be clear about what is really important to them. Along with local community groups, they need to define what data ought to be collected, and then to help drive forward to collect that data. Satellites and remote-sensing platforms that collect data on the hydrological cycle are paid for with public money, and so all of that data is in the public domain – this is the principle that the United States follows and this is how it ought to be in general, he says.
Communities would love to be able to crowdsource data to create patterns so that they could complain about inefficiency or inadequacy of water. I do not think we are there yet, but it is beginning in some things like water quality for lakes and other small water bodies around communities. But the government does have an obligation to put out a lot of data that people can use, both for research and for action on the ground. As we look at climate change, just imagine how critical water data is going to be for people and governments who have to act when water-driven crises are coming to hit us, right? So it is going to be so important to have open, public, trusted, verifiable, interoperable, and discoverable data, for quick decision-making in a fast-changing water scenario. We also forget about the state of our oceans – that data is global public goods as well. And some people are working on how we can create global data sets on the state of the world’s oceans.
I am not an expert on this so I am talking from the Samaaj side, but people are now also discussing carbon markets, carbon funds, and cap, trade, tax etc. At some point, they are going to have to start thinking about water. I am not trying to commoditize water or any such thing, but we may have no choice but to look at innovative instruments of financial policy to look at water as well. I think we are going to have to do some innovation even in the market space and the pricing of water at some level, to be able to manage it better. I come from a people perspective first, but if we look at what is happening with climate change, for example, data and modeling is going to be so important even in the building of public infrastructure in a country like ours, where we have not finished building out our public infra, right? Now, if you are going to build coastal roads, what if there were some data or some modeling available to you to say, “Excuse me, do not spend 10,000 crores on one road near the coast because in 30 years it is going to be a stranded asset.” I mean that kind of data is necessary to be able to make good decisions, especially when a country like ours has to make tough choices on public infrastructure. Having good water and climate change data at least, would go such a long way in helping us make smarter decisions.
There is another sector that we have not discussed very much, and that is the private sector, Gleick points out. There is a growing interest in water sustainability in the corporate sector. There are both good and bad companies in this area, but a tremendous amount of water is used by the private sector to produce the goods and services that all of us demand. Years ago, very little water data was publicly available about how much water different sectors use, how it is used, whether it is used sustainably, or on the quality of the discharged water. The good news in this area, he says, is that there is a set of companies that are trying to be somewhat more responsible in the CSR space. They are understanding what their own water use is and working with local communities to make sure that they are efficient and not hurting the local communities in which they work. The more effort that is pushed in that area, the more we can help solve a piece of this problem. But many corporations still use a tremendous amount of water and do not measure or report what their own water use is, and that continues to be a challenge.
If we could have more data collected at various levels on how much water is used for every unit of production of anything, it would make a difference. Many companies are now setting their own goals to keep reducing this in the whole supply chain, year on year, not just because they have suddenly become enlightened, but also because it is a strategic imperative to use less of a scarce and costly resource like water. So we are seeing a lot of innovation in this area in India and across the globe. Maybe there should be more sharing across the market sector as to how to increase water efficiency down the line.
If we take the long view, says Gleick, more water data is available, and more communities are demanding and producing certain kinds of water data. The amount of good information that is available today is much greater than it was 20 years ago in the water world. We know more, and he believes this information is having an effect on public policy. We are slowly making a transition from the old way of thinking about water, partly because we have better data and better information about both the nature of the problem and the success of certain kinds of solutions. There are certainly enormous data gaps, he points out, and a tremendous amount of information that we do not collect on the basic hydrology of water, on water quality, and on the economics and cost to communities, so there are lots of places where we could see improvements. However, Gleick does believe that we are moving in the right direction. The trick is how to move faster, collect the right kinds of information, and ensure that the information is used by politicians and policy makers.
In India, I think we do have more data out in the public domain, but how can we push for a more open public sharing of more data that is relevant for communities to act upon? It needs to be at the right granularity, it needs to be more trustworthy, and there should be more enabling policies to allow different sets of actors to collect and share data. So we basically need more open sharing of water data, and it should not flow in only one direction. Water data needs to flow in many directions. Gleick agrees, noting that we also need to have a better sense of what a truly sustainable water system looks like – what it really means to provide safe water and sanitation for every human on the planet; to support ecosystems and protect the natural environment; to protect the water that the natural environment requires as well; the proper role of economics in allocating and managing water; and the human right to water and what that means. If we put all these things together and we have a vision of what a sustainable water system is, then the kinds of data and information that we need to manage, protect, and run that system will be clearer, and that will help us figure out what data we need to collect, how we need to share that data, and how we need to use the data to influence public policy, he says.
We can put data in the center of the conversation, but actually, data for what? Data so that we can all have sustainable, equitable water for all living systems on this planet.