John Nay

New York, New York, United States
4K followers 500+ connections


Experience & Education

  • Norm Ai


Publications

  • AIs could soon run businesses – it’s an opportunity to ensure these ‘artificial persons’ follow the law

    The Conversation

    If an LLC were operated by an AI, it would have to obey the law like any other LLC, and courts could order it to pay damages, or stop doing something by issuing an injunction. An AI tasked with operating the LLC and, among other things, maintaining proper business insurance would have an incentive to understand applicable laws and comply. Having minimum business liability insurance policies is a standard requirement that most businesses impose on one another to engage in commercial relationships.

    The incentives to establish AI-operated LLCs are there. Fortunately, we believe it is possible and desirable to do the work to embed the law – what has until now been human law – into AI, and AI-powered automated compliance guardrails.

    Other authors
    See publication
  • Artificial intelligence and interspecific law

    Science

    Several experts have warned about artificial intelligence (AI) exceeding human capabilities, a “singularity” at which it might evolve beyond human control. Whether this will ever happen is a matter of conjecture. A legal singularity is afoot, however: For the first time, nonhuman entities that are not directed by humans may enter the legal system as a new “species” of legal subjects. This possibility of an “interspecific” legal system provides an opportunity to consider how AI might be built and governed. We argue that the legal system may be more ready for AI agents than many believe. Rather than attempt to ban development of powerful AI, wrapping of AI in legal form could reduce undesired AI behavior by defining targets for legal action and by providing a research agenda to improve AI governance, by embedding law into AI agents, and by training AI compliance agents.

    Other authors
    See publication
  • LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models

    NeurIPS

    The advent of large language models (LLMs) and their adoption by the legal community has given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning -- which distinguish between its many forms -- correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.

    See publication
  • ARB: Advanced Reasoning Benchmark for Large Language Models

    arXiv

    Large Language Models (LLMs) have demonstrated remarkable performance on various quantitative reasoning and knowledge benchmarks. However, many of these benchmarks are losing utility as LLMs get increasingly high scores, despite not yet reaching expert performance in these domains. We introduce ARB, a novel benchmark composed of advanced reasoning problems in multiple fields. ARB presents a more challenging test than prior benchmarks, featuring problems in mathematics, physics, biology, chemistry, and law. As a subset of ARB, we introduce a challenging set of math and physics problems which require advanced symbolic reasoning and domain knowledge. We evaluate recent models such as GPT-4 and Claude on ARB and demonstrate that current models score well below 50% on more demanding tasks. In order to improve both automatic and assisted evaluation capabilities, we introduce a rubric-based evaluation approach, allowing GPT-4 to score its own intermediate reasoning steps. Further, we conduct a human evaluation of the symbolic subset of ARB, finding promising agreement between annotators and GPT-4 rubric evaluation scores.

    See publication
  • Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence

    Better understanding of Large Language Models' (LLMs) legal analysis abilities can contribute to improving the efficiency of legal services, governing artificial intelligence, and leveraging LLMs to identify inconsistencies in law. This paper explores LLM capabilities in applying tax law. We choose this area of law because it has a structure that allows us to set up automated validation pipelines across thousands of examples, requires logical reasoning and maths skills, and enables us to test LLM capabilities in a manner relevant to real-world economic lives of citizens and companies. Our experiments demonstrate emerging legal understanding capabilities, with improved performance in each subsequent OpenAI model release. We experiment with retrieving and utilising the relevant legal authority to assess the impact of providing additional legal context to LLMs. Few-shot prompting, presenting examples of question-answer pairs, is also found to significantly enhance the performance of the most advanced model, GPT-4. The findings indicate that LLMs, particularly when combined with prompting enhancements and the correct legal texts, can perform at high levels of accuracy but not yet at expert tax lawyer levels. As LLMs continue to advance, their ability to reason about law autonomously could have significant implications for the legal profession and AI governance.

    See publication
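The few-shot prompting described in the tax-law abstract above can be sketched as simple prompt assembly: worked question-answer pairs are concatenated ahead of the new question. The example pairs and question below are hypothetical placeholders, not drawn from the paper's data.

```python
# Sketch of few-shot prompt assembly: prepend worked examples so the
# model can infer the expected answer format. All examples are made up.

def build_few_shot_prompt(examples, question):
    """Concatenate example Q/A pairs, then the unanswered question."""
    parts = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    parts.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(parts)

examples = [
    ("A taxpayer earns $50,000 and takes the standard deduction of "
     "$13,850. What is their taxable income?", "$36,150"),
    ("A taxpayer earns $80,000 and takes the standard deduction of "
     "$13,850. What is their taxable income?", "$66,150"),
]
prompt = build_few_shot_prompt(
    examples,
    "A taxpayer earns $60,000 and takes the standard deduction of "
    "$13,850. What is their taxable income?",
)
```

The resulting string would then be sent to an LLM; the trailing "Answer:" cues the model to complete the pattern.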
  • Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans

    Northwestern Journal of Technology and Intellectual Property, Volume 20

    We are currently unable to specify human goals and societal values in a way that reliably directs AI behavior. Law is a computational engine that converts opaque human values into legible and enforceable directives. Law Informs Code is the research agenda attempting to capture that complex computational process of human law, and embed it in AI. Similar to how parties to a legal contract cannot foresee every potential contingency of their future relationship, and legislators cannot predict all the circumstances under which their proposed bills will be applied, we cannot ex ante specify rules that provably direct good AI behavior. Legal theory and practice have developed arrays of tools to address these specification problems. For instance, legal standards allow humans to develop shared understandings and adapt them to novel situations. In contrast to more prosaic uses of the law (e.g., as a deterrent of bad behavior through the threat of sanction), leveraged as an expression of how humans communicate their goals, and what society values, Law Informs Code.

    We describe how the data generated by legal processes and the theoretical constructs and practices of law (methods of law-making, statutory interpretation, contract drafting, applications of standards, legal reasoning, etc.) can facilitate the robust specification of inherently vague human goals for AI. This helps with human-AI alignment and the local usefulness of AI. Toward society-AI alignment, we present a framework for understanding law as the applied philosophy of multi-agent alignment. Although law is partly a reflection of historically contingent political power - and thus not a perfect aggregation of citizen preferences - if properly parsed, its distillation offers a legitimate computational comprehension of societal values.

    See publication
  • Large Language Models as Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards

    SSRN

    Artificial Intelligence (AI) is taking on increasingly autonomous roles, e.g., browsing the web as a research assistant and managing money. But specifying goals and restrictions for AI behavior is difficult. Similar to how parties to a legal contract cannot foresee every potential “if-then” contingency of their future relationship, we cannot specify desired AI behavior for all circumstances. Legal standards facilitate the robust communication of inherently vague and underspecified goals. Instructions (in the case of language models, “prompts”) that employ legal standards will allow AI agents to develop shared understandings of the spirit of a directive that can adapt to novel situations, and generalize expectations regarding acceptable actions to take in unspecified states of the world. Standards have built-in context that is lacking from other goal specification languages, such as plain language and programming languages. Through an empirical study on thousands of evaluation labels we constructed from U.S. court opinions, we demonstrate that large language models (LLMs) are beginning to exhibit an “understanding” of one of the most relevant legal standards for AI agents: fiduciary obligations. Performance comparisons across models suggest that, as LLMs continue to exhibit improved core capabilities, their legal standards understanding will also continue to improve. OpenAI’s latest LLM has 78% accuracy on our data, their previous release has 73% accuracy, and a model from their 2020 GPT-3 paper has 27% accuracy (worse than random). Our research is an initial step toward a framework for evaluating AI understanding of legal standards more broadly, and for conducting reinforcement learning with legal feedback (RLLF).

    See publication
  • Large Language Models as Corporate Lobbyists

    SSRN

    We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. An autoregressive large language model (OpenAI’s text-davinci-003) determines if proposed U.S. Congressional bills are relevant to specific public companies and provides explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of novel ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. We also benchmark the performance of the previous OpenAI GPT-3 model (text-davinci-002), which was the state-of-the-art model on many academic natural language tasks until text-davinci-003 was recently released. The performance of text-davinci-002 is worse than a simple benchmark. These results suggest that, as large language models continue to exhibit improved natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. Longer-term, if AI begins to influence law in a manner that is not a direct extension of human intentions, this threatens the critical role that law as information could play in aligning AI with humans. This Essay explores how this is increasingly a possibility. Initially, AI is being used to simply augment human lobbyists for a small proportion of their daily tasks. However, firms have an incentive to use less and less human oversight over automated assessments of policy ideas and the written communication to regulatory agencies and Congressional staffers. The core question raised is where to draw the line between human-driven and AI-driven policy influence.

    See publication
  • Climate-contingent Finance

    Berkeley Business Law Journal

    Although climate change adaptation could yield significant benefits under future climate scenarios, the uncertainty of those scenarios decreases the feasibility of proactively adapting, especially where local political consensus is required. However, projects could be underwritten by benefits paid for in climate scenarios they’re designed to address because other entities would like to hedge the financial risk of those scenarios and support climate resilience.

    Infrastructure projects can be built to defend against more extreme climate change through upfront spending. These expenditures generate more climate resilience benefit under more extreme climate outcomes. The return on investment of the adaptation is a function of the level of climate change, so it's optimal for the adapting entity to finance adaptation with repayment also a function of the climate. It's also optimal for entities with financial downside under a more extreme climate to serve as an investing counter-party because they can obtain higher than market rates of return when they need it most.

    In this way, entities proactively adapting could reduce the risk they over-prepare and their investors could reduce the risk they under-prepare. This is superior to typical insurance because by investing in climate-contingent financial mechanisms the investors are not merely financially hedging but also outright helping prevent damage, and therefore creating value. Instead of buying insurance, they’re paying for defense. Both sides of the positive-sum transaction — physical and financial hedgers — can be made better off. This coordinates capital through time according to parties’ climate risk reduction capabilities and financial profiles.

    See publication
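The core mechanism in the abstract above, repayment that rises with the realized climate outcome, can be sketched in a few lines. The rates and the linear payoff form below are illustrative assumptions, not terms from the paper.

```python
# Toy climate-contingent repayment: the adapting entity pays more only
# in severe scenarios (when adaptation generated the most benefit), and
# investors earn above-base returns exactly when their other exposures
# suffer. All numbers and the linear form are illustrative assumptions.

def repayment(principal, base_rate, climate_severity, sensitivity):
    """Repayment owed, increasing in climate_severity (scaled to [0, 1])."""
    return principal * (1 + base_rate + sensitivity * climate_severity)

# Mild outcome: repayment close to the base rate.
mild = repayment(1_000_000, 0.03, climate_severity=0.1, sensitivity=0.10)
# Severe outcome: the investing counter-party recoups more when it
# needs the hedge most.
severe = repayment(1_000_000, 0.03, climate_severity=0.9, sensitivity=0.10)
```

This is the positive-sum structure the abstract describes: the borrower's obligation and the investor's return are both contingent on the same climate state.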
  • Aligning Artificial Intelligence with Humans through Public Policy

    SSRN

    Given that Artificial Intelligence (AI) increasingly permeates our lives, it is critical that we systematically align AI objectives with the goals and values of humans. The human-AI alignment problem stems from the impracticality of explicitly specifying the rewards that AI models should receive for all the actions they could take in all relevant states of the world. One possible solution, then, is to leverage the capabilities of AI models to learn those rewards implicitly from a rich source of data describing human values in a wide range of contexts. The democratic policy-making process produces just such data by developing specific rules, flexible standards, interpretable guidelines, and generalizable precedents that synthesize citizens’ preferences over potential actions taken in many states of the world. Therefore, computationally encoding public policies to make them legible to AI systems should be an important part of a socio-technical approach to the broader human-AI alignment puzzle. This Essay outlines research on AI that learn structures in policy data that can be leveraged for downstream tasks. As a demonstration of the ability of AI to comprehend policy, we provide a case study of an AI system that predicts the relevance of proposed legislation to any given publicly traded company and its likely effect on that company. We believe this represents the “comprehension” phase of AI and policy, but leveraging policy as a key source of human values to align AI requires “understanding” policy. We outline what will be required to move toward that. Solving the alignment problem is crucial to ensuring that AI is beneficial both individually (to the person or group deploying the AI) and socially. As AI systems are given increasing responsibility in high-stakes contexts, integrating democratically-determined policy into those systems could align their behavior with human goals in a way that is responsive to a constantly evolving society.

    See publication
  • Environmental Impact Bonds: A common framework and looking ahead

    Environmental Research: Infrastructure and Sustainability

    A frequent barrier to addressing some of our world's most pressing environmental challenges is a lack of funding. Currently, environmental project funding largely comes from philanthropic and public sources, but this does not meet current needs. Increased coordination and collaboration between multiple levels and sectors of government, in addition to private sector funding, can help address the environmental funding challenge. New financial tools and strategies can enable this transition and facilitate uptake of innovative solutions. One such mechanism, the Environmental Impact Bond (EIB), is an emerging financial tool with the potential to transform the environmental funding landscape. However, these financial instruments are not well understood or recognized beyond those actively involved in EIB projects or in the field of conservation finance. As EIBs gain momentum, there is a clear need for a common framework, including definitions and nomenclature, research needs, and outlook for the future. In this paper, we define EIB mechanics, elucidate the difference between EIBs and Green Bonds, and propose a common vocabulary for the field. Drawing on first-hand experience with the few EIBs which have been deployed, we review and assess lessons learned, trends, and paths for the future. Finally, we propose a set of future targets and discuss research goals for the field to unify around. Through this work, we identify a concrete set of research gaps and objectives, providing evidence for EIBs as one important tool in the environmental finance toolbox.

    See publication
  • Research Handbook on Big Data Law

    Edward Elgar

    Automated decision tools, which increasingly rely on machine learning (ML), are used in systems that permeate our lives. Examples range from systems for offering credit and employment, to serving advertising. We explore the relationship between generalizability and the division of labor between humans and machines in decision systems. An automated decision tool is generalizable to the extent that it produces outputs that are as correct as the outputs it produced on the data used to create it. The generalizability of a ML model depends on the training, data availability, and the underlying predictability of the outcome that it models. Ultimately, whether a tool’s generalizability is adequate for a particular decision system depends on how it is deployed, usually in conjunction with human adjudicators. Taking generalizability explicitly into account highlights important aspects of decision system design, as well as important normative trade-offs, that might otherwise be missed.

    See publication
  • Legal Informatics

    Cambridge University Press

    A theoretical and applied introduction to emerging legal technology and informatics.

    See publication
  • The Big Shift: From Monetary to Fiscal Policy

    SSRN

    This research note investigates why Congress is likely to gain importance with respect to financial market price impacts, relative to monetary policy. Economic theory that claims the primary constraint on government spending is inflation, rather than deficits, has rapidly moved into the Overton window, freeing up fiscal policy to have nearly unlimited firepower for tackling underemployment, inequality, climate change, and healthcare. There's a potential Blue Wave eager to address social, economic, and environmental issues through sweeping legislation and rule-making. There's less room left to advance monetary policy, given actions taken already. There are practically infinite levers to pull through new legislation and regulation – this is not the case with the blunt instrument of monetary policy. Proposed policies have specific sets of winners and losers, which leads to more dispersion in prices across companies, relative to the blunter monetary policy tools. This is even more pronounced with Democratic proposals. The within-day-within-sector spread between winner and loser public companies is larger for Democratic proposed policies over the past 6 years of data. More fiscal policy (and especially more of it coming from Democrats) could lead to more policy-driven winners and losers in the public markets.

    See publication
  • What are you implying? Deriving the market’s political predictions with policy impact indices

    SSRN

    Markets are the ultimate information processors. Emergent prices encapsulate the wisdom of the crowd and its views on events that could impact future value. Liquid securities are not directly linked to social and political events, but are affected by them. Isolating event-anticipating price changes can provide implied predictions of the events. We apply our methodology for isolating event-anticipating price changes to the 2016 and 2020 U.S. elections. As an election approaches, companies that would be positively affected by the expected outcome are more likely to have a positive price impact compared to those that would be negatively affected by that outcome. With estimates of the potential impacts of a political party on every company based on the policies proposed by the parties, the spread between the returns of the long and short holdings can reveal the market’s expected outcome of the election. A simulation supports the hypothesis that our approach captures the intended political exposures. The Republican index outperformed the Democratic index in the lead-up to and, especially, during the aftermath of the 2016 election. Contrary to the consensus view prior to the election, these indices implied a Republican win. The Democratic index outperformed the Republican index in 2020 before the polls reflected the information.

    See publication
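The long-short spread at the heart of the abstract above can be sketched as follows: go long companies a given election outcome would help and short those it would hurt, then read the sign of the spread as the market's implied prediction. The impact scores and returns below are made-up illustrative data, not the paper's indices.

```python
# Minimal long-short spread: positive spread suggests the market is
# pricing in the outcome the impact scores were built for.
# Returns and impact scores are illustrative, not real index data.

def long_short_spread(returns, impact_scores):
    """Mean return of positively-impacted minus negatively-impacted firms."""
    longs = [r for r, s in zip(returns, impact_scores) if s > 0]
    shorts = [r for r, s in zip(returns, impact_scores) if s < 0]
    return sum(longs) / len(longs) - sum(shorts) / len(shorts)

daily_returns = [0.012, -0.004, 0.008, -0.010]  # four companies, one day
impact = [+1, -1, +1, -1]  # +1: helped by the outcome, -1: hurt by it
spread = long_short_spread(daily_returns, impact)
```

Here the positively-scored companies outperformed, so the one-day spread is positive, the direction the paper interprets as the market anticipating that outcome.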
  • Natural Language Processing and Machine Learning for Law and Policy Texts

    Almost all law is expressed in natural language; therefore, natural language processing (NLP) is a key component of understanding and predicting law at scale. NLP converts unstructured text into a formal representation that computers can understand and analyze. The intersection of NLP and law is poised for innovation because there are (i.) a growing number of repositories of digitized machine-readable legal text data, (ii.) advances in NLP methods driven by algorithmic and hardware improvements, and (iii.) the potential to improve the effectiveness of legal services due to inefficiencies in its current practice.

    NLP is a large field and like many research areas related to computer science, it is rapidly evolving. Within NLP, this paper focuses primarily on statistical machine learning techniques because they demonstrate significant promise for advancing text informatics systems and will likely be relevant in the foreseeable future.

    First, we provide a brief overview of the different types of legal texts and the different types of machine learning methods to process those texts. We introduce the core idea of representing words and documents as numbers. Then we describe NLP tools for leveraging legal text data to accomplish tasks. Along the way, we define important NLP terms in italics and offer examples to illustrate the utility of these tools. We describe methods for automatically summarizing content (sentiment analyses, text summaries, topic models, extracting attributes and relations, document relevance scoring), predicting outcomes, and answering questions.

    See publication
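The "representing words and documents as numbers" idea introduced in the abstract above is commonly illustrated with a bag-of-words encoding: each document becomes a vector of word counts over a shared vocabulary. This is a minimal sketch with toy legal sentences, not the paper's own pipeline.

```python
# Bag-of-words: turn each document into a count vector over the
# vocabulary shared by all documents. Toy documents for illustration.
from collections import Counter

def bag_of_words(docs):
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = [
        [Counter(d.lower().split())[w] for w in vocab] for d in docs
    ]
    return vocab, vectors

docs = ["the court held the statute valid", "the statute was invalid"]
vocab, vectors = bag_of_words(docs)
```

Once text is in this numeric form, the statistical machine learning methods the paper surveys (topic models, relevance scoring, outcome prediction) can operate on it directly.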
  • Generalizability: Machine Learning and Humans-in-the-loop

    Automated decision tools, which increasingly rely on machine learning (ML), are used in decision systems that permeate our lives. Examples range from high-stakes decision systems for offering credit, university admissions and employment, to decision systems serving advertising. Here, we consider data-driven tools that attempt to predict likely behavior of individuals. The debate about ML-based decision-making has spawned an important multi-disciplinary literature, which has focused primarily on fairness, accountability, and transparency. We have been struck, however, by the lack of attention to generalizability in the scholarly and policy discourse about whether and how to incorporate automated decision tools into decision systems.

    This paper explores the relationship between generalizability and the division of labor between humans and machines in decision systems. An automated decision tool is generalizable to the extent that it produces outputs that are as correct as the outputs it produced on the data used to create it. The generalizability of a ML model depends on the training process, data availability, and the underlying predictability of the outcome that it models. Ultimately, whether a tool’s generalizability is adequate for a particular decision system depends on how it is deployed, usually in conjunction with human adjudicators. Taking generalizability explicitly into account highlights important aspects of decision system design, as well as important normative trade-offs, that might otherwise be missed.

  • Topic Modeling the President: Conventional and Computational Methods

    George Washington Law Review

    Law is generally represented through text, and lawyers have for centuries classified large bodies of legal text into distinct topics — they “topic model” the law. But large bodies of legal documents present challenges for conventional topic modeling methods. The task of gathering, reviewing, coding, sorting, and assessing a body of tens of thousands of legal documents is a daunting proposition. Recent advances in computational text analytics, a subset of the field of “artificial intelligence,” are already gaining traction in legal practice settings such as e-discovery by leveraging the speed and capacity of computers to process enormous bodies of documents. Differences between conventional and computational methods, however, suggest that computational text modeling has its own limitations, but that the two methods used in unison could be a powerful research tool for legal scholars.

    Our findings support the assessment that computational topic modeling, provided a sufficiently large corpus of documents is used, can provide important insights for legal scholars in designing and validating their topic models of legal text. To be sure, computational topic modeling used alone has its limitations, some of which are evident in our models, but when used along with conventional methods, it opens doors towards reaching more confident conclusions about how to conceptualize topics in law. Drawing from these results, we offer several use cases for computational topic modeling in legal research. At the front-end, researchers can use the method to generate better and more complete model hypotheses. At the back-end, the method can effectively be used, as we did, to validate existing topic models. And at a meta-scale, the method opens windows to test and challenge conventional legal theory. Legal scholars can do all of these without “the machines,” but there is good reason to believe we can do it better with them in the toolkit.

  • Urban Water Conservation Policies in the United States

    Earth's Future

    Urban water supply systems in the United States are increasingly stressed as economic and population growth confront limited water resources. Demand management, through conservation and improved efficiency, has long been promoted as a practical alternative to building Promethean energy‐intensive water supply infrastructure. Some cities are making great progress at managing their demand, but study of conservation policies has been limited and often regionally focused. We present a hierarchical Bayesian analysis of a new measure of urban water conservation policy, the Vanderbilt Water Conservation Index, for 195 cities in 45 states in the contiguous United States. This study does not attempt to establish causal relationships but does observe that cities in states with arid climates tend to adopt more conservation measures. Within a state, cities with more Democratic‐leaning voting preferences and large and rapidly growing populations tend to adopt more conservation measures. Economic factors and climatic differences between cities do not correlate with the number of measures adopted, but they do correlate with the character of the measures, with arid cities favoring mandatory conservation actions and cities in states with lower real personal income favoring rebates for voluntary actions. Understanding relationships between environmental and societal factors and cities' support for water conservation measures can help planners and policy makers identify obstacles and opportunities to increase the role of conservation and efficiency in making urban water supply systems sustainable.

  • Agricultural response to changes in water availability and temperature in the coterminous U.S.

    American Geophysical Union

    Future changes in temperature and water availability will significantly affect agricultural systems in the United States. We construct a new weighted county-level panel dataset for the coterminous United States to estimate the impact of temperature and water availability on agricultural health using multivariate fixed-effects panel regression models. Results show clear non-linearities in the impacts of future changes in temperature, precipitation, deficit and soil moisture on corn, wheat and soy. This research lays the groundwork for future analyses estimating non-linearities in ecosystem response to changes in temperature and water availability as well as for the exploration of regional variations in exposure to these changes.

  • A machine-learning approach to forecasting remotely sensed vegetation health

    International Journal of Remote Sensing

    Drought threatens food and water security around the world, and this threat is likely to become more severe under climate change. High-resolution predictive information can help farmers, water managers, and others to manage the effects of drought. We have created an open-source tool to produce short-term forecasts of vegetation health at high spatial resolution, using data that are global in coverage. The tool automates downloading and processing Moderate Resolution Imaging Spectroradiometer (MODIS) data sets and training gradient-boosted machine models on hundreds of millions of observations to predict future values of the enhanced vegetation index. We compared the predictive power of different sets of variables (MODIS surface reflectance data and Level-3 MODIS products) in two regions with distinct agro-ecological systems, climates, and cloud coverage: Sri Lanka and California. Performance in California is higher because of more cloud-free days and less missing data. In both regions, the correlation between the actual and model predicted vegetation health values in agricultural areas is above 0.75. Predictive power more than doubles in agricultural areas compared to a baseline model.
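
    The lagged-value setup described above can be sketched in a few lines; the toy series, lag count, and persistence baseline below are illustrative assumptions rather than the actual MODIS pipeline:

```python
def lag_features(series, n_lags):
    # Build (features, target) pairs: each target value is
    # predicted from the n_lags observations preceding it.
    return [(series[t - n_lags:t], series[t])
            for t in range(n_lags, len(series))]

def persistence_error(rows):
    # Mean absolute error of a naive baseline that predicts
    # "no change" from the most recent observation.
    return sum(abs(x[-1] - y) for x, y in rows) / len(rows)

# Toy EVI-like series; real inputs would be MODIS composites.
evi = [0.30, 0.32, 0.35, 0.40, 0.42, 0.41, 0.38]
rows = lag_features(evi, n_lags=3)
```

    A gradient-boosted model trained on such rows earns its keep only if its error beats the persistence baseline, which is the sense in which the paper's predictive power "more than doubles" over baseline.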

  • Predicting and understanding law-making with word vectors and an ensemble model

    PLOS ONE

    Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses, predicted all bills in the current Congress, and repeated until the 113th Congress served as the test. For prediction we scored each sentence of a bill with a language model that embeds legislative vocabulary into a high-dimensional, semantic-laden vector space. This language representation enables our investigation into which words increase the probability of enactment for any topic. To test the relative importance of text and context, we compared the text model to a context-only model that uses variables such as whether the bill’s sponsor is in the majority party. To test the effect of changes to bills after their introduction on our ability to predict their final outcome, we compared using the bill text and meta-data available at the time of introduction with using the most recent data. At the time of introduction, context-only predictions outperform text-only; with the newest data, text-only outperforms context-only. Combining text and context always performs best. We conducted a global sensitivity analysis on the combined model to determine important variables predicting enactment.
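
    As a rough illustration of why combining text and context can beat either alone, here is one common way to ensemble two calibrated model probabilities, blending them in log-odds space; the weights and probabilities are hypothetical, and the paper's combined model is more sophisticated than this:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def combine(p_text, p_context, w_text=0.5):
    # Blend two model probabilities in log-odds space, a standard
    # way to ensemble calibrated binary classifiers.
    logit = lambda p: math.log(p / (1.0 - p))
    return sigmoid(w_text * logit(p_text) + (1 - w_text) * logit(p_context))

# A bill whose text scores well but whose context (e.g. a
# minority-party sponsor) scores poorly:
p = combine(p_text=0.70, p_context=0.10)
```

    The blended estimate lands between the two inputs, letting whichever signal is more informative at a given stage of the bill's life pull the prediction its way.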

  • Impact of seasonal forecast use on agricultural income in a system with varying crop costs and returns: an empirically-grounded simulation

    Access to seasonal climate forecasts can benefit farmers by allowing them to make more informed decisions about their farming practices. However, it is unclear whether farmers realize these benefits when crop choices available to farmers have different and variable costs and returns; multiple countries have programs that incentivize production of certain crops while other crops are subject to market fluctuations. We hypothesize that the benefits of forecasts on farmer livelihoods will be moderated by the combined impact of differing crop economics and changing climate. Drawing upon methods and insights from both physical and social sciences, we develop a model of farmer decision-making to evaluate this hypothesis. The model dynamics are explored using empirical data from Sri Lanka; primary sources include survey and interview information as well as game-based experiments conducted with farmers in the field. Our simulations show that a farmer using seasonal forecasts has more diversified crop selections, which drive increases in average agricultural income. Increases in income are particularly notable under a drier climate scenario, when a farmer using seasonal forecasts is more likely to plant onions, a crop with higher possible returns. Our results indicate that, when water resources are scarce (i.e. drier climate scenario), farmer incomes could become stratified, potentially compounding existing disparities in farmers' financial and technical abilities to use forecasts to inform their crop selections. This analysis highlights that while programs that promote production of certain crops may ensure food security in the short-term, the long-term implications of these dynamics need careful evaluation.

  • Betting and Belief: Prediction Markets and Attribution of Climate Change

    IEEE Press

    Despite much scientific evidence, a large fraction of the American public doubts that greenhouse gases are causing global warming. We present a simulation model as a computational test-bed for climate prediction markets. Traders adapt their beliefs about future temperatures based on the profits of other traders in their social network. We simulate two alternative climate futures, in which global temperatures are primarily driven either by carbon dioxide or by solar irradiance. These represent, respectively, the scientific consensus and a hypothesis advanced by prominent skeptics. We conduct sensitivity analyses to determine how a variety of factors describing both the market and the physical climate may affect traders’ beliefs about the cause of global climate change. Market participation causes most traders to converge quickly toward believing the “true” climate model, suggesting that a climate market could be useful for building public consensus.

  • Gov2Vec: Learning Distributed Representations of Institutions and Their Legal Text

    We compare policy differences across institutions by embedding representations of the entire legal corpus of each institution and the vocabulary shared across all corpora into a continuous vector space. We apply our method, Gov2Vec, to Supreme Court opinions, Presidential actions, and official summaries of Congressional bills. The model discerns meaningful differences between government branches. We also learn representations for more fine-grained word sources: individual Presidents and (2-year) Congresses. The similarities between learned representations of Congresses over time and sitting Presidents are negatively correlated with the bill veto rate, and the temporal ordering of Presidents and Congresses was implicitly learned from only text. With the resulting vectors we answer questions such as: how do Obama and the 113th House differ in addressing climate change, and how does this vary from environmental or economic perspectives? Our work illustrates vector-arithmetic-based investigations of complex relationships between word sources. We are extending this to create a comprehensive legal semantic map.
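
    The vector-arithmetic style of query can be illustrated with toy vectors (the 3-dimensional embeddings, the institution vector, and the nearest helper below are all hypothetical; Gov2Vec learns its vectors from the corpora):

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy 3-d embeddings for illustration only.
vocab = {
    "climate":   [0.9, 0.1, 0.0],
    "economy":   [0.1, 0.9, 0.0],
    "emissions": [0.8, 0.2, 0.1],
    "tariffs":   [0.2, 0.8, 0.1],
}
obama = [0.5, 0.5, 0.5]  # hypothetical word-source vector

def nearest(query, vocab, exclude=()):
    # Answer a vector-arithmetic query: which vocabulary word
    # is most similar to the composed query vector?
    return max((w for w in vocab if w not in exclude),
               key=lambda w: cosine(query, vocab[w]))

# "climate" as addressed by this word source: add the vectors.
query = [c + o for c, o in zip(vocab["climate"], obama)]
```

    Adding and subtracting source vectors before the nearest-neighbor lookup is what lets the same vocabulary be interrogated from different institutional perspectives.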

  • Drought, Risk, and Institutional Politics in the American Southwest

    Although there are multiple causes of the water scarcity crisis in the American Southwest, it can be used as a model of the long-term problem of freshwater shortages that climate change will exacerbate. We examine the water-supply crisis for 22 cities in the extended Southwest of the United States and develop a unique, new measure of water conservation policies and programs. Convergent qualitative and quantitative analyses suggest that political conflicts play an important role in the transition of water-supply regimes toward higher levels of demand-reduction policies and programs. Qualitative analysis using institutional theory identifies the interaction of four types of motivating logics—development, rural preservation, environmental, and urban consumer—and shows how demand-reduction strategies can potentially satisfy all four. Quantitative analysis of the explanatory factors for the variation in the adoption of demand-reduction policies points to the overwhelming importance of political preferences as defined by Cook's Partisan Voting Index. We suggest that approaches to water-supply choices are influenced less by direct partisan disagreements than by broad preferences for a development logic based on supply-increase strategies and discomfort with demand-reduction strategies that clash with conservative beliefs.

  • Application of Machine Learning to the Prediction of Vegetation Health

    International Archives of the Photogrammetry, Remote Sensing & Spatial Information Sciences

    This project applies machine learning techniques to remotely sensed imagery to train and validate predictive models of vegetation health in Bangladesh and Sri Lanka. For both locations, we downloaded and processed eleven years of imagery from multiple MODIS datasets which were combined and transformed into two-dimensional matrices. We applied a gradient boosted machines model to the lagged dataset values to forecast future values of the Enhanced Vegetation Index (EVI). The predictive power of raw spectral data and MODIS products was compared across time periods and land use categories. Our models have significantly more predictive power on held-out datasets than a baseline. Though the tool was built to increase capacity to monitor vegetation health in data scarce regions like South Asia, users may include ancillary spatiotemporal datasets relevant to their region of interest to increase predictive power and to facilitate interpretation of model results. The tool can automatically update predictions as new MODIS data are made available by NASA. The tool is particularly well-suited for decision makers interested in understanding and predicting vegetation health dynamics in countries in which environmental data are scarce and cloud cover is a significant concern.

  • Predicting Human Cooperation

    PloS ONE

    The Prisoner’s Dilemma has been a subject of extensive research due to its importance in understanding the ever-present tension between individual self-interest and social benefit. A strictly dominant strategy in a Prisoner’s Dilemma (defection), when played by both players, is mutually harmful. Repetition of the Prisoner’s Dilemma can give rise to cooperation as an equilibrium, but defection is as well, and this ambiguity is difficult to resolve. The numerous behavioral experiments investigating the Prisoner’s Dilemma highlight that players often cooperate, but the level of cooperation varies significantly with the specifics of the experimental predicament. We present the first computational model of human behavior in repeated Prisoner’s Dilemma games that unifies the diversity of experimental observations in a systematic and quantitatively reliable manner. Our model relies on data we integrated from many experiments, comprising 168,386 individual decisions. The model is composed of two pieces: the first predicts the first-period action using solely the structural game parameters, while the second predicts dynamic actions using both game parameters and history of play. Our model is successful not merely at fitting the data, but in predicting behavior at multiple scales in experimental designs not used for calibration, using only information about the game structure. We demonstrate the power of our approach through a simulation analysis revealing how to best promote human cooperation.
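
    The strategic structure described above, defection strictly dominant yet mutually harmful, can be checked directly against a payoff matrix; the payoff values below are the conventional textbook choice (T=5, R=3, P=1, S=0), not taken from the paper:

```python
# Conventional Prisoner's Dilemma payoffs for the row player,
# satisfying T > R > P > S.
T, R, P, S = 5, 3, 1, 0

payoff = {
    ("C", "C"): R, ("C", "D"): S,
    ("D", "C"): T, ("D", "D"): P,
}

def best_reply(opponent_action):
    # The row player's payoff-maximizing response to a fixed
    # opponent action.
    return max("CD", key=lambda a: payoff[(a, opponent_action)])
```

    best_reply is "D" against either opponent action, so defection is strictly dominant; yet mutual cooperation pays R=3 versus P=1 for mutual defection, which is the mutual harm the abstract refers to.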

  • Data-Driven Dynamic Decision Models

    IEEE

    This article outlines a method for automatically generating models of dynamic decision-making that both have strong predictive power and are interpretable in human terms. This is useful for designing empirically grounded agent-based simulations and for gaining direct insight into observed dynamic processes. We use an efficient model representation and a genetic algorithm-based estimation process to generate simple approximations that explain most of the structure of complex stochastic processes. This method, implemented in C++ and R, scales well to large data sets. We apply our methods to empirical data from human subjects game experiments and international relations. We also demonstrate the method's ability to recover known data-generating processes by simulating data with agent-based models and correctly deriving the underlying decision models for multiple agent models and degrees of stochasticity.

    Other authors
    • Jonathan Gilligan
  • datafsm: Software for Estimating Finite State Machine Models from Data

    CRAN

    Software package for automatically generating models of dynamic decision-making that both have strong predictive power and are interpretable in human terms.
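
    To illustrate what an estimated finite state machine model of decision-making looks like, here is tit-for-tat encoded as a two-state machine and scored against a history of play; this is a hypothetical Python sketch of the idea, not the package's R interface:

```python
# Tit-for-tat as a two-state machine: the current state is the
# action to play; transitions fire on (state, opponent's move).
TFT = {
    "start": "C",
    "transition": {("C", "C"): "C", ("C", "D"): "D",
                   ("D", "C"): "C", ("D", "D"): "D"},
}

def accuracy(fsm, opponent_moves, observed_moves):
    # Fraction of observed decisions the machine reproduces,
    # i.e. the fitness a genetic algorithm would maximize when
    # searching over candidate machines.
    state, hits = fsm["start"], 0
    for opp, obs in zip(opponent_moves, observed_moves):
        hits += state == obs
        state = fsm["transition"][(state, opp)]
    return hits / len(observed_moves)
```

    A player who mirrors the opponent with one period of lag is reproduced perfectly by this machine, which is what makes estimated FSMs interpretable: the fitted transition table is itself a readable decision rule.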

  • Participatory simulations of urban flooding for learning and decision support

    WSC '15: Proceedings of the 2015 Winter Simulation Conference, December 2015, pages 3174–3175

    Flood-control measures, such as levees and floodwalls, can backfire and increase risks of disastrous floods by giving the public a false sense of security and thus encouraging people to build valuable property in high-risk locations. More generally, nonlinear interactions between human land-use and natural processes can produce unexpected emergent phenomena in coupled human-natural systems (CHNS). We describe a participatory agent-based simulation of coupled urban development and flood risks and discuss the potential of this simulation to help educate a wide range of the public (from middle- and high-school students to public officials) about emergence in CHNS, and present results from two pilot studies.

  • Predicting Cooperation and Designing Institutions: An Integration of Behavioral Data, Machine Learning, and Simulation

    IEEE

    Empirical game theory experiments attempt to estimate causal effects of institutional factors on behavioral outcomes by systematically varying the rules of the game with human participants motivated by financial incentives. I developed a computational simulation analog of empirical game experiments that facilitates investigating institutional design questions. Given the full control the artificial laboratory affords, simulated experiments can more reliably implement experimental designs. I compiled a large database of decisions from a variety of repeated social dilemma experiments, developed a statistical model that predicted individual-level decisions in a held-out test dataset with 90% accuracy, and implemented the model in agent-based simulations where I apply constrained optimization techniques to designing games – and by theoretical extension, institutions – that maximize cooperation levels.

  • A review of decision-support models for adaptation to climate change in the context of development

    Climate and Development

    In order to increase adaptive capacity and empower people to cope with their changing environment, it is imperative to develop decision-support tools that help people understand and respond to challenges and opportunities. Some such tools have emerged in response to social and economic shifts in light of anticipated climatic change. Climate change will play out at the local level, and adaptive behaviours will be influenced by local resources and knowledge. Community-based insights are essential building blocks for effective planning. However, in order to mainstream and scale up adaptation, it is useful to have mechanisms for evaluating the benefits and costs of candidate adaptation strategies. This article reviews relevant literature and presents an argument in favour of using various modelling tools directed at these considerations. The authors also provide evidence for the balancing of qualitative and quantitative elements in assessments of programme proposals considered for financing through mechanisms that have the potential to scale up effective adaptation, such as the Adaptation Fund under the Kyoto Protocol. The article concludes that it is important that researchers and practitioners maintain flexibility in their analyses, so that they are themselves adaptable, to allow communities to best manage the emerging challenges of climate change and the long-standing challenges of development.

  • The Influence of Seasonal Forecast Accuracy on Farmer Behavior: An Agent-Based Modeling Approach

    American Geophysical Union

    Seasonal climates dictate the livelihoods of farmers in developing countries. While farmers in developed countries often have seasonal forecasts on which to base their cropping decisions, developing world farmers usually make plans for the season without such information. Climate change increases the seasonal uncertainty, making things more difficult for farmers. Providing seasonal forecasts to these farmers is seen as a way to help buffer these typically marginal groups from the effects of climate change, though how to do so and the efficacy of such an effort is still uncertain. In Sri Lanka, an effort is underway to provide such forecasts to farmers. The accuracy of these forecasts is likely to have large impacts on how farmers accept and respond to the information they receive. We present an agent-based model to explore how the accuracy of seasonal rainfall forecasts affects the growing decisions and behavior of farmers in Sri Lanka. Using a decision function based on prospect theory, this model simulates farmers' behavior in the face of a wet, dry, or normal forecast. Farmers can either choose to grow paddy rice or plant a cash crop. Prospect theory is used to evaluate outcomes of the growing season; the farmer's memory of the level of success under a certain set of conditions affects next season's decision. Results from this study have implications for policy makers and seasonal forecasters.
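
    The prospect-theoretic evaluation of growing-season outcomes can be sketched with the standard Kahneman-Tversky value function; the parameter values below are the widely cited 1992 estimates, not necessarily those used in this model:

```python
def prospect_value(x, alpha=0.88, lam=2.25):
    # Kahneman-Tversky value function: concave over gains,
    # convex and steeper over losses (loss aversion).
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha
```

    Because losses are weighted more than twice as heavily as equal gains, a farmer who remembers a failed cash-crop season will discount that option more than its expected return alone would suggest, which is the mechanism driving the forecast-acceptance dynamics in the model.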

    Other authors
    • John Jacobi
    • Jonathan Gilligan
