Rules and ethics for use of AI by governments

The governments of British Columbia and Yukon, in Canada, have jointly issued a report (June 2021) on the ethical use of AI in the public sector. It’s interesting to me because it covers issues of privacy and fairness, and in particular the right of people to question decisions derived from AI systems. The report notes that the public increasingly expects government services to be as fast and as personalized as services from online platforms such as Amazon, and that this expectation is driving (or will drive) wider adoption of AI systems to help deliver government services to the public.

The report’s concluding recommendations (pages 47–48) cover eight points (edited):

  1. Establish guiding principles for AI use: “Each public authority should make a public commitment to guiding principles for the use of AI that incorporate transparency, accountability, legality, procedural fairness and protection of privacy.”
  2. Inform the public: “If an ADS [automated decision system] is used to make a decision about an individual, public authorities must notify and describe how that system operates to the individual in a way that is understandable.”
  3. Provide human accountability: “Identify individuals within the public authority who are responsible for engineering, maintaining, and overseeing the design, operation, testing and updating of any ADS.”
  4. Ensure that auditing and transparency are possible: “All ADS should include robust and open auditing functionality with enhanced transparency measures for closed-source, proprietary datasets used to develop and update any ADS.”
  5. Protect privacy of individuals: “Wherever possible, public authorities should use synthetic or de-identified data in any ADS.” See synthetic data definition, below.
  6. Build capacity and increase education (for understanding of AI): This point covers “public education initiatives to improve general knowledge of the impact of AI and other emerging technologies on the public, on organizations that serve the public,” etc.; “subject-matter knowledge and expertise on AI across government ministries”; “knowledge sharing and expertise between government and AI developers and vendors”; development of “open-source, high-quality data sets for training and testing ADS”; “ongoing training of ADS administrators” within government agencies.
  7. Amend privacy legislation to include: “an Artificial Intelligence Fairness and Privacy Impact Assessment for all existing and future AI programs”; “the right to notification that ADS is used, an explanation of the reasons and criteria used, and the ability to object to the use of ADS”; “explicit inclusion of service providers to the same obligations as public authorities”; “stronger enforcement powers in both the public and private sector …”; “special rules or restrictions for the processing of highly sensitive information by ADS”; “shorter legislative review periods of 4 years.”
  8. Review legislation to make sure “oversight bodies are able to review AIFPIAs [see item 7 above] and conduct investigations regarding the use of ADS alone or in collaboration with other oversight bodies.”

Synthetic data is defined (on page 51) as: “A type of anonymized data used as a filter for information that would otherwise compromise the confidentiality of certain aspects of data. Personal information is removed by a process of synthesis, ensuring the data retains its statistical significance. To create synthetic data, techniques from both the fields of cryptography and statistics are used to render data safe against current re-identification attacks.”
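The report doesn’t spell out how the synthesis is done, and real synthetic-data pipelines model joint distributions and test the output against re-identification attacks, as the definition above indicates. But the basic idea of replacing real records with statistically similar fake ones can be sketched in a few lines of Python (the column names and values here are hypothetical, and only per-column statistics are preserved):

```python
# Toy sketch of synthetic data generation: sample each column from a
# distribution fitted to the real column, so no real record is copied.
# Illustrative only; it preserves per-column statistics, not relationships
# between columns, and it makes no re-identification guarantees.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

def synthesize(df: pd.DataFrame, n_rows: int) -> pd.DataFrame:
    synthetic = {}
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            # Keep the mean and spread of numeric columns.
            synthetic[col] = rng.normal(df[col].mean(), df[col].std(), n_rows)
        else:
            # Keep the category frequencies of non-numeric columns.
            freqs = df[col].value_counts(normalize=True)
            synthetic[col] = rng.choice(freqs.index.to_numpy(), size=n_rows, p=freqs.to_numpy())
    return pd.DataFrame(synthetic)

# Hypothetical sensitive table of service users, replaced by a statistical stand-in.
real = pd.DataFrame({"age": [34, 51, 29, 62, 45],
                     "region": ["north", "south", "north", "east", "south"]})
print(synthesize(real, 1000).describe(include="all"))
```

A production system would also have to preserve relationships between columns and verify that no synthetic record can be traced back to a real person, which is where the cryptographic and statistical techniques mentioned in the report come in.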

The report uses the term automated decision systems (ADS) in view of the Government of Canada’s Directive on Automated Decision Making, which defines them as: “Any technology that either assists or replaces the judgement of human decision-makers.”


Using machine learning to uncover racist laws

A common use of machine learning is to train a model to identify a particular kind of document, or a particular characteristic in a document — and then sort a gigantic set of documents. This produces a much-reduced subset of all documents that match the desired criteria. There might be some false positives in the subset, but it still gives researchers or journalists a big jump forward by eliminating thousands of unwanted documents.

This kind of sorting goes well beyond a simple search for keywords.
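To make that concrete, here is a generic sketch (not the On the Books code discussed below) of how a small hand-labeled sample can train a classifier that then flags likely matches in a much larger corpus. All of the text, the labels, and the 0.8 threshold are invented for illustration:

```python
# Generic document-sorting sketch: train on a small labeled sample, then
# flag likely matches in a large unlabeled corpus. Texts, labels, and the
# probability threshold are all hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

labeled_texts = [
    "separate schools shall be maintained for the white race and the colored race",
    "an act to provide for the maintenance of public roads in the several counties",
]
labels = [1, 0]  # 1 = matches the target category, 0 = does not

model = make_pipeline(TfidfVectorizer(stop_words="english"),
                      LogisticRegression(max_iter=1000))
model.fit(labeled_texts, labels)

# Score every document in the big corpus and keep only the high-probability hits.
corpus = [
    "marriages between white persons and persons of negro descent are prohibited",
    "the board of commissioners may levy a tax for the construction of bridges",
]
scores = model.predict_proba(corpus)[:, 1]
flagged = [doc for doc, p in zip(corpus, scores) if p >= 0.8]
```

In practice the labeled sample would contain hundreds of examples and the flagged subset would still need human review, but the model does the first, brutal cut through tens of thousands of sections.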

[Screenshot from On the Books at lib.unc.edu]

A great example has emerged from the University of North Carolina at Chapel Hill. On the Books: Jim Crow and Algorithms of Resistance is a project that includes a public plain-text collection of North Carolina laws (1866–1967) likely to be Jim Crow laws.

There is a public GitHub repo of the code used in this project. It includes a full walkthrough of the project’s workflow — data acquisition and cleaning, OCR, unsupervised and supervised classification, etc.
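The OCR step alone is worth pausing on, because everything downstream depends on having machine-readable text. A minimal version might look like the snippet below; Tesseract and the filename are my assumptions here, not necessarily what the project used:

```python
# Minimal OCR step (illustrative only): convert one scanned page image into
# plain text for downstream cleaning and classification.
import pytesseract          # requires the Tesseract OCR engine installed locally
from PIL import Image

page_text = pytesseract.image_to_string(Image.open("scanned_statute_page.png"))
with open("scanned_statute_page.txt", "w", encoding="utf-8") as out:
    out.write(page_text)
```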

The base document set (the main corpus) consists of 96 volumes containing 53,515 chapters and 297,790 sections (source).

The project’s title pays homage to Safiya Noble’s 2018 book Algorithms of Oppression: How Search Engines Reinforce Racism.

“State-based racial segregation laws were incredibly inconvenient, irregular, and, most importantly, unconstitutional.”

—William Sturkey, Ph.D.

A historical perspective on this data collection was provided by William Sturkey, a history professor at UNC, in “On the Books”: Machine Learning Jim Crow (September 2020). He says On the Books is “the first and most complete collection of all Jim Crow laws from a single American state.” He points to the difficulty of cataloging and studying all Jim Crow laws from any state “because there were just so many.”


How might we regulate AI to prevent discrimination?

Discussions about regulation of AI, and algorithms in general, often revolve around privacy and misuse of personal data. Protections against bias and unfair treatment are also part of this conversation.

In a recent article in Harvard Business Review, lawyer Andrew Burt (who might prefer to be called a “legal engineer”) wrote about using existing legal standards to guide efforts at ensuring fairness in AI-based systems. In the United States, these include the Equal Credit Opportunity Act, the Civil Rights Act, and the Fair Housing Act.

[Photo by Tingey Injury Law Firm on Unsplash]

Burt emphasizes the danger of unintentional discrimination, which can arise from basing the “knowledge” in the system on past data. You might think it would make sense to train an AI to do things the way your business has done things in the past — but if that means denying loans disproportionately to people of color, then you’re baking discrimination right into the system.
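Detecting that kind of baked-in bias can start with something very simple: compare outcomes across groups. The sketch below uses invented data and borrows the “four-fifths” rule of thumb from US employment-law practice; it is an illustration, not something Burt prescribes:

```python
# Hypothetical disparate-impact check: each group's approval rate should be
# at least 80% of the best-treated group's rate (the "four-fifths" rule of thumb).
def approval_rate(outcomes: list[int]) -> float:
    return sum(outcomes) / len(outcomes)   # outcomes: 1 = approved, 0 = denied

def disparate_impact(groups: dict[str, list[int]], floor: float = 0.8) -> dict[str, bool]:
    rates = {name: approval_rate(o) for name, o in groups.items()}
    best = max(rates.values())
    return {name: rate / best >= floor for name, rate in rates.items()}

# Toy decisions from a model trained on historical lending data.
decisions = {
    "group_x": [1, 1, 1, 0, 1, 1, 0, 1],   # 75% approved
    "group_y": [1, 0, 0, 0, 1, 0, 0, 1],   # 37.5% approved
}
print(disparate_impact(decisions))   # {'group_x': True, 'group_y': False}
```

A real audit would go much further (confidence intervals, intersectional groups, the choice of metric itself), but even a check this crude would flag the kind of disparity described in the loan example above.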

Burt linked to a post on the Google AI Blog that in turn links to a GitHub repo for a set of code components called ML-fairness-gym. The resource lets developers build a simulation to explore potential long-term impacts of a machine learning decision system — such as one that would decide who gets a loan and who doesn’t.

In several cases, long-term analysis via simulation revealed unintended adverse consequences of the decisions made by the ML system; these are detailed in a paper by Google researchers. Determining the true outcomes of an AI system, in other words, is not just a matter of feeding in the data and letting a reliable model churn out yes/no decisions for a firm.
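I won’t reproduce the ML-fairness-gym API here, but the core idea of those simulations, a fixed decision policy interacting with a population whose circumstances shift in response to its decisions, can be illustrated with a toy feedback loop. Every number below is invented:

```python
# Toy feedback-loop simulation (invented numbers; ml-fairness-gym provides
# proper Gym-style environments for this). A fixed approval threshold is
# applied for many rounds, and each decision nudges an applicant's score.
import numpy as np

rng = np.random.default_rng(0)

def simulate(threshold: float, start_scores: np.ndarray, steps: int = 50) -> np.ndarray:
    scores = start_scores.astype(float).copy()
    for _ in range(steps):
        approved = scores >= threshold
        repaid = rng.random(scores.shape) < scores   # higher score -> more likely to repay
        scores[approved & repaid] += 0.03            # repaid loans build credit
        scores[approved & ~repaid] -= 0.05           # defaults damage it
        scores[~approved] -= 0.01                    # denied applicants slowly drift down
        scores = scores.clip(0.0, 1.0)
    return scores

# Two groups start with different score distributions but face the same threshold.
start_a = rng.normal(0.65, 0.10, 10_000)
start_b = rng.normal(0.55, 0.10, 10_000)
end_a, end_b = simulate(0.6, start_a), simulate(0.6, start_b)
print(f"gap before: {start_a.mean() - start_b.mean():.2f}  after: {end_a.mean() - end_b.mean():.2f}")
```

In this toy setup the group that starts lower ends up even further behind, which is exactly the kind of long-term effect a one-shot accuracy test never surfaces.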

It makes me wonder about all the cheerleading and hype around “business solutions” offered by large firms such as Deloitte. Have those systems been tested for their long-term effects? Is there any guarantee of fairness toward the people whose lives will be affected by the AI system’s decisions?

And what is “fair,” anyway? Burt points out that statistical methods used to detect a disparate impact depend on human decisions about “what ‘fairness’ should mean in the context of each specific use case” — and also how to measure fairness.

The same applies to the law — not only in how it is written but also in how it is interpreted. Humans write the laws, and humans sit in judgment. However, legal standards are long established and can be used to place requirements on companies that produce, deploy, and use AI systems, Burt suggests.

  • Companies must “carefully monitor and document all their attempts to reduce algorithmic unfairness.”
  • They must also “generate clear, good faith justifications for using the models” that are at the heart of the AI systems they develop, use, or sell.

If these suggested standards were applied in a legal context, it could be shown whether a company had employed due diligence and acted responsibly. If the standards were written into law, companies that deploy unfair and discriminatory AI systems could be held liable and face penalties.

Creative Commons License
AI in Media and Society by Mindy McAdams is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Include the author’s name (Mindy McAdams) and a link to the original post in any reuse of this content.
