Edited by JYH-AN LEE, RETO M HILTY, AND KUNG-CHUNG LIU
Great Clarendon Street, Oxford, OX2 6DP, United Kingdom
Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries
The moral rights of the authors have been asserted
First Edition published in 2021
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above
You must not circulate this work in any other form and you must impose this same condition on any acquirer
Crown copyright material is reproduced under Class Licence Number C01P0000148 with the permission of OPSI and the Queen’s Printer for Scotland
Published in the United States of America by Oxford University Press 198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2020944786
ISBN 978–0–19–887094–4
DOI: 10.1093/oso/9780198870944.001.0001
Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.
List of Contributors
Feroz Ali is a legal consultant with the Vision Realization Office, Ministry of Health, Saudi Arabia. He is a visiting professor at National Law School of India University, Bangalore, his alma mater, where he teaches the course ‘Regulating Artificial Intelligence’. He was the first Intellectual Property Rights (IPR) Chair Professor at the Indian Institute of Technology (IIT) Madras, where he worked closely with the IPM Cell and the Dean ICSR in managing their IPR portfolio. He offers online courses in intellectual property law on the NPTEL/Swayam platform. As an advocate, he has represented clients before the Patent Office, Intellectual Property Appellate Board, and High Courts and Supreme Court of India. He founded Techgrapher.com, a platform for managing intellectual property, and LexCampus.in, a portal that trains patent professionals. He has authored three books on patent law. He is an alumnus of Trinity College, University of Cambridge, and Duke University School of Law.
Hao-Yun Chen is an Assistant Professor at the College of Law at National Taipei University in Taiwan. She holds an LLD from Nagoya University in Japan. Prior to joining National Taipei University, she taught at National Taiwan University of Science and Technology. She teaches and researches in the field of intellectual property law, with special emphasis on the enforcement of patent rights, trademark protection and unfair competition law, copyright issues arising from new technology, and the relation between intellectual property law and competition law. Before starting her academic career, she worked as an associate attorney in a law firm in Taiwan.
Conrado Freitas is an in-house trademark counsel. Prior to that, he contributed to several IP-related research projects as a Research Assistant at the Institute for Globalisation and International Regulation (IGIR) at Maastricht University. He holds an LLM in Intellectual Property Law and Knowledge Management from Maastricht University. He also holds a bachelor’s degree in Law from the Federal University of Rio de Janeiro, Brazil, and has completed several IP courses offered by WIPO and Brazilian IP associations. Conrado is a qualified lawyer in Brazil with over seven years’ experience in the intellectual property field.
Andres Guadamuz is a Senior Lecturer in Intellectual Property Law at the University of Sussex and the Editor-in-Chief of the Journal of World Intellectual Property. His main research areas are: artificial intelligence and copyright, open licensing, cryptocurrencies, and smart contracts. Andres has published two books, the most recent of which is Networks, Complexity and Internet Regulation, and he regularly blogs at Technollama.co.uk. He has acted as an international consultant for the World Intellectual Property Organization (WIPO), and has done activist work with Creative Commons.
Andrew Fang Hao Sen is a family physician and a medical informatics practitioner at SingHealth. He has over ten years’ experience in the medical field. He graduated from the Yong Loo Lin School of Medicine, National University of Singapore (NUS), and furthered his medical education to obtain a Master of Medicine (Family Medicine). He also holds a Master of Technology (Enterprise Business Analytics) from NUS and graduated with the IBM Medal and Prize. He also lectures at the NUS Department of Statistics and Applied Probability. He is passionate about exploring and using technology to improve healthcare delivery, and his research focus is on healthcare analytics for clinical decision support.
Tianxiang He is Assistant Professor at the School of Law, City University of Hong Kong, where he serves as the Associate Director of the LLM Programme. He holds an LLB degree (Huaqiao University, China, 2007) and a master’s degree in International Law (Jinan University, China, 2009). He received his PhD in IP law at Maastricht University (the Netherlands, 2016) and another PhD in Criminal Law at Renmin University of China (2017). He is the author of Copyright and Fan Productivity in China: A Cross-jurisdictional Perspective (Springer 2017). He is an Elected Associate Member of the International Academy of Comparative Law (IACL). His research focuses on comparative intellectual property law, law and technology, and the intersections between the regulative framework of cultural products and fundamental rights such as freedom of speech.
Reto M Hilty is Director at the Max Planck Institute for Innovation and Competition in Munich, Germany. He is a full Professor ad personam at the University of Zurich and Honorary Professor at the Ludwig Maximilian University of Munich. In 2019, he received an Honorary Doctorate from the University of Buenos Aires. He specializes in intellectual property and competition law with a further focus on IP-specific contract law. Moreover, his research centres on the impact of new technologies and business models on intellectual property rights and the European and international harmonization of intellectual property law.
Jörg Hoffmann is a Junior Research Fellow at the Max Planck Institute for Innovation and Competition, Munich, a doctoral candidate at the Ludwig Maximilian University of Munich, and a fully qualified lawyer in Germany. He studied law at UCL and the Ludwig Maximilian University of Munich, where he obtained his law degree with a specialization in European Law and Public International Law. His main research interests lie in the fields of intellectual property and competition law with a major focus on the implications of the digital economy on the regulatory framework pertaining to innovation and competition in data-driven markets.
Ivan Khoo Yi is currently a doctor in Singapore’s primary healthcare industry. He was educated within the Singapore school system and graduated from the prestigious Yong Loo Lin School of Medicine, National University of Singapore in 2010. During his time in medicine, he received an award as one of the best house officers during his houseman year, and went on to practice otorhinolaryngology for a brief period. In 2016, seeking a new challenge, he
enrolled in the Singapore Management University Juris Doctor programme, graduating summa cum laude in 2019. His education in law has fundamentally changed his outlook on the practice of medicine.
Jyh-An Lee is an Associate Professor of Law at the Chinese University of Hong Kong, where he currently serves as the Assistant Dean for Undergraduate Studies and Director of the LLB Programme. He holds a JSD from Stanford Law School and an LLM from Harvard Law School. Prior to joining the Chinese University of Hong Kong, he taught at National Chengchi University and was an Associate Research Fellow in the Center for Information Technology Innovation at Academia Sinica in Taiwan. He was the Legal Lead and Co-Lead of Creative Commons Taiwan (2011–14) and an advisory committee member for Copyright Amendment in the Taiwan Intellectual Property Office (TIPO) at the Ministry of Economic Affairs (2011–14). He has been the Legal Lead of the Creative Commons Hong Kong Chapter since October 2018. Before starting his academic career, he was a practising lawyer in Taiwan, specializing in technology and business transactions.
Matthias Leistner is Professor of Private Law and Intellectual Property Law, with Information and IT-Law (GRUR Chair) at LMU Munich. He studied law in Berlin, Brussels, Munich, and Cambridge, obtaining his doctorate (Dr. iur., LMU Munich 1999) with research at the Max Planck Institute for Innovation and Competition in Munich, and an LLM from the University of Cambridge (2004). He completed his Habilitation (post-doctoral thesis) at LMU Munich in 2006. Apart from his Chair at LMU Munich, he is at present a Member of the Faculty of the Munich Intellectual Property Law Center (MIPLC) and a guest professor for European Intellectual Property Law at the University of Xiamen, China, and at Tongji University, Shanghai. He was an International Short Term Visiting Professor at Columbia Law School in the Spring Term 2020. His areas of expertise are intellectual property law, unfair competition law, and data and information law.
Jianchen Liu holds the position of Research Associate at ARCIALA, School of Law, Singapore Management University. He is also a PhD candidate at Renmin University of China, majoring in IP law and an LLM candidate at Columbia Law School. He focuses his research on the intersection between AI and IP law, as well as competition law and data protection. To date, he has published over ten articles on these topics in several law journals. Prior to pursuing his academic career, he worked as an IP lawyer for a world-renowned US law firm and a leading Chinese law firm for three years.
Kung-Chung Liu is Lee Kong Chian Professor of Law (Practice) and founding Director of Applied Research Centre for Intellectual Assets and the Law in Asia (ARCIALA) of Singapore Management University. He is also a professor of Renmin University of China and the Graduate Institute of Technology, Innovation and Intellectual Property Management, National Chengchi University, Taiwan. His teaching and research interests are intellectual property law, antitrust and unfair competition law, communications law, and the interface between those disciplines, with a geographic focus on greater China and Asia.
Ming Liu serves as Head of the Research Division at the Patent Re-examination Board (PRB) of the National Intellectual Property Administration of China (CNIPA). He is a high-level member, a second-level examiner, and an expert of the Standing Panel for Examining Matters at CNIPA. He has significant experience in patent examination and invalidation. He has worked for CNIPA since 2002, after obtaining his MS degree from the Chinese Academy of Sciences, first as a patent examiner; he transferred to the PRB in 2007 and has since heard patent invalidation cases, many of which have been highly influential both domestically and overseas. He has also been involved in the legislative process of the Patent Law, the Implementing Rules of the Patent Law, and the Guidelines for Patent Examination. He has published twenty articles in foreign and domestic academic journals.
Eliza Mik is an Assistant Professor at the Chinese University of Hong Kong Faculty of Law. She holds a PhD in contract law from the University of Sydney. She has taught courses in contract law and in the law of e-commerce at the Singapore Management University and the University of Melbourne, as well as courses in FinTech and Blockchain at Bocconi University in Milan. In parallel with a line of research focused on distributed ledger technologies and smart contracts, she is involved in multiple projects relating to the legal implications of transaction automation. Eliza holds multiple academic affiliations, including those with the Tilburg Institute for Law, Society and Technology (TILT) and the Center for AI and Data Governance in Singapore. Before joining academia, Eliza worked in-house in a number of software companies, Internet start-ups, and telecommunication providers, where she advised on technology procurement, payment systems, and software licensing.
Anke Moerland is Assistant Professor of Intellectual Property Law in the European and International Law Department, Maastricht University. Her research relates to the interface of intellectual property law and political science, with a focus on governance aspects of intellectual property regulation in international trade negotiations, and more specifically in the area of geographical indications and trade mark law. Dr. Moerland holds degrees in law (Maastricht University) and international relations (Technical University Dresden), with a PhD in intellectual property protection in EU bilateral trade agreements from Maastricht University. Since 2017, she has coordinated the EIPIN Innovation Society, a four-year Horizon 2020 grant under the Marie Skłodowska Curie Action ITN-EJD. Since 2018, she has held a visiting professorship in Intellectual Property Law, Governance and Art at the School of Law, Centre for Commercial Law Studies of Queen Mary University of London.
Ichiro Nakayama is Professor of Law at the Graduate School of Law, Hokkaido University. Before he joined Hokkaido University in 2019, he served as Associate Professor at the School of Law of Shinshu University from 2005 to 2009 and Professor at the School of Law of Kokugakuin University from 2009 to 2019. Prior to joining academia, Professor Nakayama joined the Ministry of International Trade and Industry (MITI) in 1989 and spent sixteen years in the Government of Japan, where he worked not only on intellectual property law and policies but also in other fields, including energy policy and national security policy. Professor Nakayama received an LLB in 1989 from the University of Tokyo, an LLM in 1995 from the University of Washington, and an MIA in 1997 from Columbia University. He
has published a number of articles in the field of intellectual property law with a focus on patent law.
Anselm Kamperman Sanders is Professor of Intellectual Property Law, Director of the Advanced Masters Intellectual Property Law and Knowledge Management (IPKM LLM/MSc), and Academic Director of the Institute for Globalization and International Regulation (IGIR) at Maastricht University, the Netherlands. He acts as Academic Co-director of the Annual Intellectual Property Law School and IP Seminar of the Institute for European Studies of Macau (IEEM), Macau SAR, China, and is Adjunct Professor at Jinan University Law School, Guangzhou, China. Anselm holds a PhD from the Centre for Commercial Law Studies, Queen Mary University of London, where he worked as a Marie Skłodowska-Curie Fellow before joining Maastricht University in 1995. For the UN, he was a member of the expert group for the World Economic and Social Survey 2018. Anselm sits as a deputy judge in the Court of Appeal in The Hague, which has exclusive jurisdiction on patent matters.
Stefan Scheuerer is a Junior Research Fellow at the Max Planck Institute for Innovation and Competition, Munich, a doctoral candidate at the Ludwig Maximilian University of Munich (LMU), and a fully qualified lawyer in Germany. He is a member of the Max Planck Institute’s research group on the regulation of the digital economy. Previously, he studied law at LMU with a specialization in intellectual property law, competition law, and media law, and gained practical experience in these fields in the course of several internships, inter alia at the European Commission, DG Competition, Brussels. His main research interests lie in the fields of intellectual property law, unfair competition law, legal theory, and law and society, especially in the context of the digital economy.
Daniel Seng is an Associate Professor of Law and Director of the Centre for Technology, Robotics, AI & the Law (TRAIL) at NUS. He teaches and researches on information technology and intellectual property law. He graduated with firsts from NUS and Oxford and won the Rupert Cross Prize in 1994. His doctoral thesis at Stanford University involved the use of machine learning, natural language processing, and data analytics to analyse the effects and limits of automation on the DMCA takedown process. Dr. Seng is a special consultant to the World Intellectual Property Organization (WIPO) and has presented and published papers on differential privacy, electronic evidence, information technology, intellectual property, artificial intelligence, and machine learning at various local, regional, and international conferences. He has been a member of various Singapore government committees that undertook legislative reforms in diverse areas such as electronic commerce, cybercrimes, digital copyright, online content regulation, and data protection.
Peter R Slowinski is a Junior Research Fellow at the Max Planck Institute for Innovation and Competition in Munich. He is admitted as an attorney-at-law (Rechtsanwalt) in Germany and is a qualified mediator. In addition, he holds a Master of the Science of Law (JSM) from Stanford Law School, having completed the Stanford Program in International Legal Studies (SPILS). He has given lectures at Stanford Law School and the Munich Intellectual Property Law Center (MIPLC). Until 2016, he practised as a patent litigator in infringement
and nullity proceedings in Germany. His research focuses on patents and dispute resolution. He has published on copyright and patent law. Mr. Slowinski has conducted an empirical study on mediation proceedings in patent law and participated in the SPC Study of the Max Planck Institute. He is a member of the research groups on data-driven economies and AI as well as life sciences.
Anthony Man-Cho So received his BSE degree in Computer Science from Princeton University with minors in Applied and Computational Mathematics, Engineering and Management Systems, and German Language and Culture. He then received his MSc and PhD degrees in Computer Science with a PhD minor in Mathematics from Stanford University. Professor So joined CUHK in 2007 and is currently Professor in the Department of Systems Engineering and Engineering Management. His research focuses on optimization theory and its applications in various areas of science and engineering, including computational geometry, machine learning, signal processing, and statistics. He has received a number of research and teaching awards, including the 2018 IEEE Signal Processing Society Best Paper Award, the 2016–17 CUHK Research Excellence Award, the 2013 CUHK Vice-Chancellor’s Exemplary Teaching Award, and the 2010 Institute for Operations Research and the Management Sciences (INFORMS) Optimization Society Optimization Prize for Young Researchers.
Benjamin Sobel is an Affiliate at Harvard University’s Berkman Klein Center for Internet & Society. His research and teaching examine the way digital media, artificial intelligence, and networked devices influence intellectual property, privacy, security, and expression. His article, ‘Artificial Intelligence’s Fair Use Crisis’, was among the first publications to comprehensively examine the intersection between machine learning technology and the fair use doctrine in US copyright law.
Shufeng Zheng is a research associate of the Applied Research Centre for Intellectual Assets and the Law in Asia (ARCIALA) of Singapore Management University and a PhD student at Peking University. Before that, she was a Research Assistant at the Peking University Science and Technology Law Research Center, and obtained a master’s degree in Common Law from the University of Hong Kong and a master’s degree in Intellectual Property Law from Peking University. Her research focuses on the protection of and access to data, copyright licensing schemes, and patent protection for software-related inventions.
Raphael Zingg is an Assistant Professor at Waseda University, Institute for Advanced Study, Tokyo, and a Research Fellow at the ETH Zurich Center for Law & Economics. He has worked as a visiting scholar at a number of foreign institutions, notably the University of California, Berkeley, the University of Hong Kong, Singapore Management University, and the Max Planck Institute for Innovation and Competition in Munich. His scholarly fields of interest include the study of the patent system, biotechnology, nanotechnology, artificial intelligence laws, and the protection of cultural heritage. He received his PhD from ETH Zurich, and his degrees in law from the universities of Zurich, Fribourg, and Paris II.
Roadmap to Artificial Intelligence and Intellectual Property
An Introduction
Jyh-An Lee, Reto M Hilty, and Kung-Chung Liu
The Broader Picture and Structure of the Book
The ability of computers to imitate intelligent human behaviour has drawn wide attention in recent years; we humans are increasingly ceding our decision-making power to technological artefacts. With the advancement of data science and computing technologies, artificial intelligence (AI) has become omnipresent in all sectors of society. Face and speech recognition, visual perception, self-driving vehicles, surgical robots, and automated recommendations from social media are all well-known examples of how computer systems perform tasks that normally require human intelligence. From manufacturing to healthcare services, AI has already improved previous best practices. Drawing on large volumes of data, AI can make predictions more accurately than is humanly possible. The overwhelming intellectual power of AI is also exemplified by AlphaGo and AlphaZero, which have taught themselves to beat the best human players of chess, Go, and Shogi.
AI also enables new models of creativity and innovation with its data-driven approach. While human beings have used various instruments and technologies to create and innovate, they themselves have been the main driving force of creativity and innovation. AI puts that into question, raising numerous challenges to the existing intellectual property (IP) regime. Traditionally, the ‘intellectual’ part of ‘intellectual property’ refers to human intellect. However, since machines have become intelligent and are increasingly capable of making creative, innovative choices based on opaque algorithms, the ‘intellectual’ in ‘intellectual property’ turns out to be perplexing. Existing human-centric IP regimes based on promoting incentives and avoiding disincentives may no longer be relevant—or may even be positively detrimental—when AI comes into play. Moreover, AI has sparked new issues in IP law regarding legal subjects, scope, standards of protection, exceptions, and relationships between actors.
This book proceeds in seven interconnected parts. Part I provides the technical, business, and economic foundations for the analysis of IP
issues in the AI environment in the following parts of the book. Part II examines emerging substantive patent law and policy issues associated with AI, including foundational patents in AI-related inventions, the patentability of AI inventions, and how AI tools raise the standard of the inventive step. This part also illustrates how patent prosecution has evolved from material to textual to digital. Part III probes into two major copyright issues concerning AI’s involvement in creation: the copyrightability of AI-generated works and copyright exceptions for text and data mining (TDM). Parts II and III present various legal and policy concerns in patent law and copyright law, respectively. However, patent law, copyright law, and trademark law occasionally share the same conundrum caused by the rapid development of AI technologies.
From Parts IV to VII, the book covers issues relevant to multiple categories of IP. While AI has enhanced the efficiency of IP administration and enforcement, it has generated new problems yet to be solved. Therefore, Part IV explores how AI reshapes IP administration in the public sector and IP enforcement in the private sector. Part V examines copyright and patent protection for AI software, which is qualitatively different from traditional computer programs. While AI is implemented by software, the protection for such software per se has been ignored by the mainstream IP literature. Part VI discusses the protection of and access to data, which is the driving force of all AI inventions and applications. It further illustrates how IP law will interact with other fields of law, such as unfair competition law and personal data protection law, on various data-related issues. Part VII provides a broader picture of AI and IP, searching for solutions to fundamental inquiries, such as IP and competition policy in the era of AI and whether an AI should be viewed as a legal person.
Individual Chapters
AI is a catch-all term that covers cognitive computing, machine learning (ML), evolutionary algorithms, rule-based systems, and the process of engineering intelligent machines. Anthony Man-Cho So’s chapter provides essential knowledge for IP lawyers to understand AI technologies. ML is a core part of many AI applications, a process by which algorithms detect meaningful patterns in training data and use them for prediction or decision-making. The volume and quality of training data therefore always play crucial roles in the performance of AI applications. Because of their nested, non-linear structure, AI models are usually applied in a black-box manner. Consequently, AI systems’ ‘interpretability’ or ‘explainability’, ie, the degree to which a human observer can understand the cause of a decision by the system, has been a concern for policymakers and AI users alike. Sometimes, even an AI developer can neither fully understand an AI’s decision-making process nor predict its decisions or outputs. In supervised ML, a prediction rule can map an input (a data sample) to an expected output (a label). Currently, the most powerful
way to implement a prediction rule is an artificial neural network, which is inspired by biological neural networks. In contrast, unsupervised ML has neither labels nor prediction rules. The goal of unsupervised learning is to uncover hidden structures in data. Besides supervised and unsupervised learning, reinforcement learning is a third ML paradigm, in which a software agent learns by interacting with its environment to achieve a certain goal.
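To make these paradigms concrete, consider the following minimal Python sketch. It is purely illustrative: the synthetic data, the scikit-learn calls, and all parameter choices are our assumptions rather than examples drawn from any chapter.

```python
# Illustrative sketch only: two of the three ML paradigms in miniature.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Supervised learning: inputs come with labels; fit a prediction rule
# that maps an input to an expected output.
X = rng.uniform(0, 10, size=(100, 1))
y = 3 * X.ravel() + rng.normal(0, 1, size=100)     # labelled training data
rule = LinearRegression().fit(X, y)                # the learned prediction rule
print(rule.predict([[5.0]]))                       # label predicted for an unseen input

# Unsupervised learning: no labels; uncover hidden structure in the data.
points = np.vstack([rng.normal(0, 1, size=(50, 2)),
                    rng.normal(5, 1, size=(50, 2))])
print(KMeans(n_clusters=2, n_init=10).fit_predict(points)[:5])

# Reinforcement learning is not sketched here: an agent learns by acting
# in an environment and receiving rewards, not from a fixed dataset.
```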
As a powerful instrument for business growth and development, AI technologies serve almost all areas of business operation, from corporate finance to human resource management to digital marketing. Among many other business sectors, healthcare in particular exemplifies how AI has fundamentally reshaped the whole industry. Ivan Khoo Yi and Andrew Fang’s chapter illustrates AI’s impact on the industry’s main stakeholders: healthcare providers, patients, pharmaceutical companies, healthcare payers (insurance companies and employers), and policymakers. While AI has enhanced the quality and effectiveness of medical examination, treatment, and overall medical services, it has also created challenges, such as mismatches between training data and operational data, opaque decision-making, privacy issues, and more.
IP regimes are designed to balance various economic interests and moral values. Ordinary IP policy concerns include, but are not limited to, incentives for creativity, technological innovation, economic development, dissemination of knowledge, and overall social welfare. Considering all these interests and values, the exclusivity and monopoly of IP rights can only be justified if their consequent social benefits outweigh their social costs. The same understanding is applicable to discussions of whether IP protection is desirable in AI markets. Based on mainstream economic theories of IP, Reto M Hilty, Jörg Hoffmann, and Stefan Scheuerer assess the necessity of IP protection for AI tools and AI outputs. While different AI tools and AI outputs might lead to different conclusions on this issue, the robust development in AI technology implies that IP may not be a necessary incentive to foster innovation in this field. Moreover, the underlying black box in AI systems possibly runs afoul of the disclosure theory in patent law.
Emerging technologies can be blocked by broad foundational patents, and AI is no exception. These upstream patents cover essential aspects of technologies and thus hamper downstream research or resulting products. Therefore, from a policy perspective, certain building blocks of future innovation should be free from such patent enclosure. Raphael Zingg examines the foundational triadic AI patents filed with the United States Patent and Trademark Office (USPTO), the European Patent Office (EPO), and the Japanese Patent Office (JPO), illustrating major players’ intentions to acquire foundational patents in AI. He suggests that patent offices and courts can protect the AI field from patent enclosure by strictly applying patentability standards.
The opaque black box behind AI algorithms can raise legal questions, especially where transparency and disclosure are legally required. For example, the disclosure of the invention is required in patent applications in most jurisdictions. Ichiro
Nakayama uses the JPO’s 2019 Examination Handbook for Patent and Utility Model as an example to illustrate how the disclosure requirement in patent law is applied to AI inventions with inexplicable black boxes. He further explores how AI tools will affect the hypothetical person having ordinary skill in the art (PHOSITA) and the level of inventive step. Once we recognize that AI is a common tool used by the PHOSITA, the level of inventive step will rise dramatically because many scientific experiments will be much more easily completed with the assistance of AI.
Looking forward, AI may fundamentally change patent prosecution and the role of patent offices. While AI systems can easily determine the novelty, disclosure, and enablement of an invention, a patent office will only need to focus on whether the application meets the inventive-step requirement. Feroz Ali envisions this near future in which inventions are presented digitally to patent offices and patent prosecution becomes a decentralized, AI-enabled process.
The wide use of AI in creating works also challenges copyright policy’s goal of maintaining the balance between protecting creative works and allowing the public to use them. AI can currently produce music, paintings, poems, film scripts, and a wide variety of other works. It is legally perplexing whether these works are subject to copyright protection. While humans have deployed various tools and technologies to create, they have been the main sources of creativity in the history of copyright law. However, AI, like human beings, can now make creative decisions and generate creative works by learning from existing works. This development has precipitated a debate concerning the copyrightability of AI-generated works. Andres Guadamuz provides a comparative analysis of copyrightability and originality issues regarding AI-generated works by studying copyright laws and practices in Australia, China, the United Kingdom (UK), and the United States (US), among others. While copyright laws in most jurisdictions do not protect AI-generated works, it is noteworthy that such works may be protected by the computer-generated works provisions in the Copyright, Designs and Patents Act (CDPA) 1988 in the UK. Similar provisions appear in some common law jurisdictions, such as Ireland, New Zealand, Hong Kong, and South Africa. Jyh-An Lee focuses on policy and legal issues surrounding the output of AI and copyright protection of computer-generated works under the CDPA 1988. He argues that from both legal and policy perspectives, the UK and other jurisdictions with similar computer-generated work provisions in their copyright laws should reconsider their approaches to these works.
Using AI to create works inevitably involves the reproduction of data, which might be the copyrighted works of others. Therefore, copyright infringement risks appear when TDM techniques are used to ‘train’ AI. To foster AI development and ease AI developers’ concerns over copyright infringement, more and more jurisdictions have added TDM to their list of copyright exceptions. Notable examples include the UK 2014 amendment to the CDPA (1988), the German Copyright Law (2018), the Japan Copyright Law (2018), and the European Union (EU) Directive
on Copyright in the Digital Single Market (2019). While these TDM exceptions are subject to different application criteria, they share the underlying rationale that copyright liability should not overburden the promising development of AI. Tianxiang He’s chapter explores the possible applications and legislation of TDM exceptions in China. After examining the copyright exceptions models in Japan, Korea, and Taiwan, he argues that a semi-open copyright exceptions model incorporating the essence of the Japanese Copyright Law is most suitable for China and its AI industry. Benjamin Sobel approaches the copyright limitation and exception of TDM from a different angle. He argues that TDM exceptions should be designed and applied based on the nature of training data. Sobel develops a novel taxonomy of training data and suggests that copyright law should only regulate market-encroaching uses of data.
Beyond the application of substantive IP law, IP administration and enforcement have also been fundamentally reshaped by AI technology. Jianchen Liu and Ming Liu study China’s patent examination of AI-related inventions and its recent regulatory movements. China first amended its Guidelines for Patent Examinations (Guidelines) in 2017 to allow the patentability of software-implemented business methods. It also distinguished computer-program-implemented inventions from computer programs themselves. The 2019 amendment of the Guidelines further provides that if an AI-related patent application contains both technical and non-technical solutions, including algorithms and business methods, it will not be rejected outright because of the non-technical parts. Additionally, this chapter provides some examples of AI patents approved in China.
Anke Moerland and Conrado Freitas demonstrate how AI can be used to examine trademark applications and to assess prior marks in oppositions and infringement proceedings. Their chapter compares the functionality of AI-based algorithms used by trademark offices and evaluates the capability of these AI systems in applying legal tests for trademark examination. Their empirical findings reveal that only a few trademark offices currently apply AI tools to assess trademark registration applications. Furthermore, no court has yet used AI to assist its judgement in a trademark case. Moerland and Freitas also identify AI’s limitations in implementing legal tests that involve subjective judgement, such as the distinctiveness of a trademark and the likelihood of confusion among the relevant public.
Compared to IP administration in the public sector, AI techniques are more widely adopted by trademark and copyright owners and online platforms in the digital environment. Daniel Seng’s chapter introduces various automated IP-enforcement mechanisms adopted by platforms, such as Amazon, Alibaba, and eBay. These mechanisms include automated detection systems for counterfeits and automated takedown systems for content providers. While these automated techniques significantly enhance the efficiency of IP enforcement, they are not a panacea to curb piracy and counterfeiting activities in the online marketplace. Substantial transaction costs still prevent IP owners, online sellers, and platforms
from collaborating with each other to restrain piracy and counterfeiting activities. Moreover, technology-driven private ordering is potentially subject to manipulation because of information asymmetry between stakeholders. These problems may be reinforced by AI’s black-box method of processing data. Seng proposes legal reforms to address the problems underlying automated IP enforcement mechanisms.
As the core of AI, software has transformed into a generative tool capable of learning and self-correcting. Unlike traditional software with predesigned inputs and outputs, the operation of AI-related software has become a dynamic process. For example, evolutionary algorithms are one genre of AI software that generate and continuously test solutions for their fitness to accomplish a given task, as the sketch below illustrates. Hao-Yun Chen refers to such software as ‘software 2.0’ in her chapter, which mainly focuses on copyright protection for a new generation of computer programs. In addition to examining how copyright doctrines, such as the idea/expression dichotomy and authorship, can be applied to software 2.0, she explores whether natural-rights and utilitarian theories can justify copyright protection of the functional aspects of software 2.0. Peter Slowinski’s chapter discusses the general IP protection of AI software from a different angle by comparing the IP laws in the US and the EU. He investigates both copyright and patent protection of different parts of the AI software, which include algorithms, mathematical models, data, and overall applications.
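As a toy illustration of the generate-and-test loop just described, the following sketch evolves candidate solutions by selection and mutation. The all-ones bit-string task is a hypothetical stand-in for a real objective, and every parameter here is an assumption.

```python
# Toy evolutionary algorithm (illustrative only): candidate solutions are
# repeatedly generated, tested for fitness, and selectively retained.
import random

random.seed(0)
LENGTH = 20

def fitness(candidate):
    """Score a candidate: number of bits matching the all-ones target."""
    return sum(candidate)

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(30)]
for generation in range(500):
    population.sort(key=fitness, reverse=True)
    if fitness(population[0]) == LENGTH:
        break                                   # a perfect solution has evolved
    parents = population[:10]                   # selection: keep the fittest
    offspring = [[bit ^ (random.random() < 0.05) for bit in p]  # random mutation
                 for p in parents * 2]
    population = parents + offspring

print(generation, fitness(population[0]))
```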
The AI economy is a data-centric one because AI systems analyse enormous amounts of data. Therefore, access to and protection of data have been crucial to the development of the AI industry. Kung-Chung Liu and Shufeng Zheng classify data into three categories: data specifically generated for AI, big data, and copyright-protected data. While these three types are not mutually exclusive, each is subject to different protection and access issues governed by IP law, competition law, and personal data protection law. Moreover, data generated by the public and private sectors have different policy implications for their protection and access, which inevitably intertwine with other data policies, such as open data and competition policies. Matthias Leistner approaches the same issue from the EU perspective and argues that access to data is a more urgent issue than protection of data under the current IP regime. As in many other jurisdictions, databases with original selections or arrangements are protected in the EU by copyright law as compilations. Moreover, the EU Database Directive has established a sui generis right for database makers. While both copyright law and the EU Database Directive provide exceptions to exclusive rights, a more comprehensive infrastructural framework for data access and exchange is still desirable. Leistner’s chapter also evaluates possible reforms of access to data, including establishing sector-specific access rights, requiring licences, and applying fair, reasonable, and non-discriminatory (FRAND) terms to assure data users’ access rights.
AI algorithms are intricately woven into our economy and create a pervasively automated society. Anselm Kamperman Sanders approaches IP issues from a
broader perspective of human trust and governance. Because AI is a core component of the Fourth Industrial Revolution, AI-related IP has the potential to generate market dominance in the connected environment of sensory devices and datasets. Consequently, such IP will reshape market structures and trigger new competition policy concerns.
When machines begin to exhibit human-like intelligence, another legal puzzle appears: whether an AI should be recognized as a legal person, like a corporation. If the law identifies AIs as legal persons, they will be able to enjoy legal rights and bear legal obligations. AIs will also have the capacity to enter into agreements with other parties and to sue and be sued. AI personality has also become a commonly discussed issue in the IP literature. When AI plays a major role in creative and innovative activities and is referred to as a ‘non-human’ author or inventor, some suggest that it should be the legal subject that owns the IP in question. Likewise, when the deployment of AI involves IP-infringement risk, some contend that it should be held liable for infringement. Nonetheless, recognizing an AI as a legal person is not currently the mainstream viewpoint; the EPO and the USPTO have ruled that an AI system cannot be recognized as an inventor of a patent. Eliza Mik’s chapter explains why AIs should not be deemed legal persons based on their technological features and the nature of legal persons in our current legal system.
This book is a result of collaboration between two Asian academic institutions—the Applied Research Centre for Intellectual Assets and the Law in Asia (ARCIALA), the School of Law, Singapore Management University, and the Chinese University of Hong Kong Faculty of Law—and one European institution, the Max Planck Institute for Innovation and Competition. As a result, it might have distinctly Asian and European touches; however, the editors intend to elucidate the general challenges and opportunities faced by every jurisdiction in the era of AI. We believe all policy and legal analysis should be based on a correct understanding of the technology and the economics of innovation, and an ideal policy should facilitate human sovereignty over machine efficiency. By the same token, a desirable IP policy should enable society to fully grasp the value of new technologies for economic prosperity.
1 Technical Elements of Machine Learning for Intellectual Property Law
Anthony Man-Cho So*
1. Introduction
Although the field of artificial intelligence (AI) has been around for more than sixty years, its widespread influence is a rather recent (within the past decade or so) phenomenon. From human face recognition to artificial face generation, from automated recommendations on online platforms to computer-aided diagnosis, from game-playing programs to self-driving cars, we have witnessed the transformative power of AI in our daily lives. As it turns out, machine learning (ML) techniques lie at the core of many of these innovations. ML is a sub-field of AI that is concerned with the automated detection of meaningful patterns in data and using the detected patterns for certain tasks.1 Roughly speaking, the learning process involves an algorithm,2 which takes training data (representing past knowledge or experience) as input and outputs information that can be utilized by other algorithms to perform tasks such as prediction or decision-making. With the huge amount of data generated on various online platforms,3 the increasing power (in terms of both speed and memory) of computers, and advances in ML research, researchers and practitioners alike have been able to unleash the power of ML and contribute to the many impressive technologies we are using or experiencing today. In this chapter, I will give an overview of the key concepts and constructions in ML and,
* All online materials were accessed before 30 March 2020.
1 Shai Shalev-Shwartz and Shai Ben-David, Understanding Machine Learning: From Theory to Algorithms (Cambridge University Press 2014) (hereafter Shalev-Shwartz and Ben-David, Understanding Machine Learning).
2 An algorithm is a well-defined sequence of computational steps for solving a problem. Specifically, it takes zero or more values as inputs and applies the sequence of steps to transform them into one or more outputs. Note that an algorithm can be described in, say, the English language (which is easier for humans to understand) or in a programming language (which is easier for the computer to process). The word program refers to an expression of an algorithm in a programming language. See Donald E Knuth, The Art of Computer Programming. Volume I: Fundamental Algorithms, 3rd edn (Addison Wesley Longman 1997) for a more detailed discussion.
3 The data could be in the form of images and texts posted on social media, browsing and purchasing history on e-commerce sites, or emails sent and received using online email platforms, to name just a few.
with an aim to make them more concrete, explain the roles they play in some of the contemporary applications. In addition, I will elucidate the ways human efforts are involved in the development of ML solutions, which I hope could facilitate the legal discussions on intellectual property issues. In recent years, there has been much interest in applying ML techniques to legal tasks such as legal prediction and classification of legal documents. However, the discussion of these applications is beyond the scope of this chapter.4
2. Main Types of Machine Learning
The outcome of any learning process depends, among other things, on the material from which the learner learns. As alluded to in the introduction, in the context of ML, the learning material comes in the form of training data. Since the training data in most applications of interest are too complex and too large for humans to process and reason about, the power of modern computers is harnessed to identify the patterns in and extract information from those data. A key characteristic of ML algorithms is that they can adapt to their training data. In particular, with better training data (in terms of volume and quality), these algorithms can produce outputs that have better performance for the tasks at hand. In order to distinguish among different ML tasks, it is common to classify them according to the nature of the training data and the learning process. In this section, I will describe three main types of ML tasks—namely, supervised learning, unsupervised learning, and reinforcement learning—and explain how they manifest themselves in various real-life applications.
2.1 Supervised Learning
Supervised learning refers to the scenario in which the training data contain certain information (commonly referred to as the label) that is missing in the test data (ie, data that have not been seen before), and the goal is to use the knowledge learned from the training data to predict the missing information in the test data. It has been successfully applied to various fields, such as credit risk assessment5 and medical imaging.6 To better understand the notion of supervised learning, let
4 Readers who are interested in some of the applications of ML in the legal field can refer to, eg, Harry Surden, ‘Machine Learning and Law’ (2014) 89 Washington Law Review 87.
5 Dinesh Bacham and Janet Yinqing Zhao, ‘Machine Learning: Challenges, Lessons, and Opportunities in Credit Risk Modeling’ (2017) 9 Moody’s Analytics Risk Perspectives: Managing Disruption 30.
6 Geert Litjens and others, ‘A Survey on Deep Learning in Medical Image Analysis’ (2017) 42 Medical Image Analysis 60.
Figure 1.1 Sample handwritten digits from the MNIST database with their corresponding labels
me highlight three of its key elements—preparation of training data, formulation of the learning task, and implementation of algorithmic solutions to perform the learning.
2.1.1 Preparation of training data
The word ‘supervised’ in ‘supervised learning’ comes from the fact that the training data contain information that guides, or supervises, the learning process. Typically, the information is supplied by humans (a process referred to as labelling the data). As such, it often requires substantial effort to prepare the training data for a supervised learning task.7 To illustrate the concepts of training data and test data in the supervised learning setting, consider the task of recognizing handwritten digits. The training data can be a collection of handwritten digit samples, each of which is labelled with its interpretation (ie, 0–9). Figure 1.1 shows a small portion of such a collection from the MNIST database.8 Any collection of handwritten digit samples that have not been labelled or seen before can then be the test data.
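A minimal sketch of this setting may be helpful. It assumes scikit-learn’s small built-in digits dataset as a stand-in for MNIST, and the choice of classifier is ours: labelled samples serve as the training data, while held-out samples play the role of unseen test data.

```python
# Minimal supervised-learning sketch (assumed dataset and model choices).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = load_digits()                          # 8x8 digit images with labels 0-9
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

clf = SVC().fit(X_train, y_train)               # learn from labelled training data
print("accuracy on unseen test data:", clf.score(X_test, y_test))
```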
It is important to note that in general the label given to a data sample is not guaranteed to be correct. This can be caused, eg, by human error or by the ambiguity in the data sample itself. For instance, in labelling handwritten digit samples,
7 Nowadays, it is common to use crowdsourcing to get a large volume of data labelled. One manifestation of this is the use of CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) on various websites. Although the explicitly stated purpose of CAPTCHAs is to authenticate users as humans (to prove ‘I’m not a robot’), the responses given by human users provide information about the queries posed by CAPTCHAs (eg, identify the traffic lights in the image, transcribe the distorted words, etc), thus labelling the data in those queries in the process. See, eg, Luis von Ahn and others, ‘reCAPTCHA: Human-Based Character Recognition via Web Security Measures’ (2008) 321(5895) Science 1465, for a discussion.
8 Yann LeCun, Corinna Cortes, and Christopher JC Burges, ‘The MNIST Database of Handwritten Digits’ (2010) <http://yann.lecun.com/exdb/mnist/>.
Figure 1.2 An ambiguous handwritten digit: Is this a ‘0’ or ‘6’?
mistakes can occur when the handwritten digits are hardly legible (Figure 1.2). As the premise of supervised learning is to use the knowledge learned from the labels of the training data samples to predict the labels of the test data samples, the presence of incorrectly labelled data samples can adversely affect the outcome of the learning process.
2.1.2 Formulation of learning task
The prediction of the labels of the data samples relies on a prediction rule, ie, a function that takes a data sample as input and returns a label for that sample as output. With this abstraction, the goal of supervised learning can be understood as coming up with a prediction rule that performs well on most data samples. Here, the performance is evaluated by a loss function, which measures the discrepancy between the label returned by the prediction rule and the actual label of the data sample. The choice of the loss function is largely dictated by the learning task at hand, and the loss functions in common use are well known.9
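For illustration, two standard loss functions are sketched below. The particular choices are assumptions on our part, since the text leaves the loss function task-dependent.

```python
# Two standard loss functions (assumed examples): each measures the
# discrepancy between a predicted label and the actual label.
def zero_one_loss(predicted, actual):
    """Classification: 0 if the predicted label matches, 1 otherwise."""
    return 0 if predicted == actual else 1

def squared_loss(predicted, actual):
    """Regression: penalizes the discrepancy quadratically."""
    return (predicted - actual) ** 2

print(zero_one_loss(7, 7), zero_one_loss(7, 1))  # -> 0 1
print(squared_loss(2.5, 3.0))                    # -> 0.25
```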
To achieve the aforementioned goal, a natural idea is to search among rules that minimize the loss function on the training data. In other words, we aim to find the rule that best fits our past knowledge or experience. However, without restricting the type of rules to search from, such an idea can easily lead to rules that perform poorly on the unseen test data. This phenomenon is known as over-fitting. As an illustration, consider the task of classifying data points on the plane into two categories. Figure 1.3 shows the training data, in which each data point is labelled by either a cross ‘×’ or a circle ‘○’ to indicate the category it belongs to. A prediction rule takes the form of a boundary on the plane, so that given any point, the side of the boundary on which the point falls will yield its predicted category. Given a boundary, a common way to measure its performance on the training data is to count the number of points that it misclassified. Naturally, the fewer misclassified points, the better the boundary.
Suppose that we do not restrict the type of boundaries we can use. Then, a boundary that misclassifies the fewest training data samples is given by the bolded curve in Figure 1.3a. Indeed, all the crosses are on the left of the curve, while all the circles are on the right. However, such a boundary fits the training data too well and is not well-suited for dealing with potential variations in the test data. In particular, it is more likely to return a wrong classification for a test data sample.
9 See, eg, Shalev-Shwartz and Ben-David, Understanding Machine Learning (n 1) for a discussion of different loss functions.
On the other hand, suppose that we restrict ourselves to using only straight-line boundaries. Then, the dotted line in Figure 1.3b yields the best performance among all straight lines in terms of the number of misclassified training data points. Although the dotted line incorrectly classifies some of those points (eg, there are two circles on the left and two crosses on the right of the line), it can better handle variations in the test data and is thus preferable to the curved boundary in Figure 1.3a.
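The phenomenon can be reproduced in a short sketch. The synthetic data and the two model choices below are our assumptions: a deep decision tree stands in for the unrestricted boundary of Figure 1.3a and logistic regression for the straight-line boundary of Figure 1.3b. Typically, the tree fits the noisy training data almost perfectly yet scores worse on the held-out test data.

```python
# Over-fitting sketch (assumed synthetic data and stand-in models).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, flip_y=0.15, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

flexible = DecisionTreeClassifier(random_state=1).fit(X_tr, y_tr)  # unrestricted boundary
straight = LogisticRegression().fit(X_tr, y_tr)                    # straight-line boundary

print("flexible train/test:", flexible.score(X_tr, y_tr), flexible.score(X_te, y_te))
print("straight train/test:", straight.score(X_tr, y_tr), straight.score(X_te, y_te))
```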
The above discussion highlights the necessity to choose the type of prediction rules that will be used to fit the training data. Such a choice depends on the learning task at hand and has to be made by human users before seeing the data. In general, there are many different types of prediction rules that one can choose from. Some examples include polynomial functions, decision trees, and neural networks of various architectures. A key characteristic of these different types of rules is that each type can be defined by a set of parameters. In other words, each choice of values for the parameters corresponds to one prediction rule of the prescribed type. For instance, in the classification example above, the straight-line boundaries used in Figure 1.3b, which are lines on the plane, can be described by two parameters—slope and intercept. As another illustration, let us consider neural networks, which constitute one of the most powerful and popular types of prediction rules in ML today. Roughly speaking, a neural network consists of nodes (representing neurons) linked by arrows. Each arrow has a weight and connects the output of a node (ie, the tail of the arrow) to the input of another node (ie, the head of the arrow). Each node implements a function whose input is given by a weighted sum of the outputs of all the nodes linked to it, where the weights are obtained from the corresponding arrows. The architecture of a neural network is specified by its nodes, the links between the nodes, and the functions implemented on the nodes.10 The weights on the links then constitute the parameters that
10 Shalev-Shwartz and Ben-David, Understanding Machine Learning (n 1).
Figure 1.3 Illustration of over-fitting in a classification task
Figure 1.4 A simple feedforward neural network
describe different neural networks with the same architecture. Some commonly used neural network architectures include autoencoders, convolutional neural networks (CNNs), feedforward networks, and recurrent neural networks (RNNs). Each of these architectures is designed for particular learning tasks.11
Figure 1.4 shows an example of a simple three-layer feedforward neural network. It takes three inputs, which are denoted by x, y, z. The weight assigned to an arrow is given by the number next to it. All the nodes implement the same function, which is denoted by f(·).12 To get a glimpse of what is being computed at the nodes, let us focus on the shaded node. It has two inputs, one from the output of the first node in the first layer, the second from the output of the second node in the first layer. The former, which equals f(x), has a weight of 0.8; the latter, which equals f(y), has a weight of 0.6. Therefore, the output of the shaded node is computed as 0.8 × f(x) + 0.6 × f(y). By assigning a different set of weights to the arrows, we obtain a different neural network with the same architecture. As an aside, one often sees the word ‘deep’ being used to describe neural networks nowadays. Loosely speaking, it simply refers to a neural network with many (say, more than two) layers.
11 Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning (MIT Press 2016), available at <http://www.deeplearningbook.org> (hereafter Goodfellow, Bengio, and Courville, Deep Learning).
12 Mathematically, a function can be regarded as specifying an input-output relationship. The dot ‘·’ in the notation ‘f(·)’ represents a generic input to the function f. Given a number t as input, the function f returns the number f(t) as output.
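The computation at the shaded node of Figure 1.4 can be written out in a few lines of Python. The choice of f as the sigmoid and the sample input values are our assumptions, since the figure leaves f and the inputs generic.

```python
# The shaded node of Figure 1.4 in code (assumed f and inputs).
import math

def f(t):
    """The function implemented at each node; sigmoid is a common choice."""
    return 1 / (1 + math.exp(-t))

x, y = 0.5, -1.0                          # two of the network's three inputs
# Weighted sum of the connected first-layer outputs (weights 0.8 and 0.6):
shaded_output = 0.8 * f(x) + 0.6 * f(y)
print(shaded_output)

# Different weights give a different network with the same architecture:
alternative = 0.3 * f(x) + 0.9 * f(y)
```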
Once the human user specifies the type of prediction rules to use, the next step is to find the values of the parameters that minimize the loss function on the training data. This gives rise to a computational problem commonly known as loss minimization. By solving this problem, one obtains as output a prediction rule of the prescribed type that best fits the training data. The rule can then be integrated into other decision support tools to inform the decisions of human users.
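As a minimal illustration of what a loss minimization problem looks like, the sketch below fits the family of straight lines y = a·x + b to a toy data set, using mean squared error as the loss function; both the data values and the choice of loss are invented for the example.

import numpy as np

# Toy labelled training data (invented values).
inputs = np.array([0.0, 1.0, 2.0, 3.0])
labels = np.array([1.1, 1.9, 3.2, 3.8])

def loss(a, b):
    # Mean squared error of the rule 'predict a*x + b' on the training data.
    # Loss minimization seeks the (a, b) at which this value is smallest.
    predictions = a * inputs + b
    return np.mean((predictions - labels) ** 2)

print(loss(1.0, 1.0))  # a well-fitting rule has a small loss
print(loss(0.0, 0.0))  # a poorly fitting rule has a large loss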
2.1.3 Implementation of algorithmic solution
Loss minimization problems are typically solved by iterative algorithms. Starting from an initial choice of values for the parameters, which can be viewed as a point in space, these algorithms proceed by moving the point in a certain direction by a certain distance, and then repeat until a stopping criterion is met. Different algorithms use different rules to determine the direction and distance at each point and have different stopping criteria. Generally speaking, the directions and distances are designed so that the values of the loss function evaluated at the points generated by the algorithm exhibit a decreasing trend (recall that the goal is to minimize the loss function), and the algorithm stops when no further progress can be made. One popular iterative algorithm for solving loss minimization problems is the stochastic gradient method. At each step, the method moves the current point along a random direction generated from the properties of the loss function, and the distance by which the point is moved decreases as the method progresses, so as to avoid overshooting the solution.13
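A bare-bones version of the stochastic gradient method, applied to the toy line-fitting loss above, might look as follows. The random direction comes from evaluating the gradient of the loss at a single randomly chosen training sample, and the step size shrinks over time to avoid overshooting; the particular schedule 0.1/step is an arbitrary choice made for this sketch.

import numpy as np

rng = np.random.default_rng(0)
inputs = np.array([0.0, 1.0, 2.0, 3.0])
labels = np.array([1.1, 1.9, 3.2, 3.8])

a, b = 0.0, 0.0  # initial choice of parameter values (the initialization)
for step in range(1, 201):
    i = rng.integers(len(inputs))  # a random sample yields a random direction
    error = (a * inputs[i] + b) - labels[i]
    # Gradient (up to a constant factor) of the squared error at sample i.
    grad_a, grad_b = error * inputs[i], error
    step_size = 0.1 / step  # decreasing distance as the method progresses
    a -= step_size * grad_a
    b -= step_size * grad_b

print(a, b)  # parameters of a line that approximately fits the toy data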
Although algorithm design requires human effort and it is natural for developers to protect their algorithms in some way, the specifications (ie, the rules for choosing directions and distances, and the stopping criterion) of many iterative algorithms used in the ML community are public knowledge. Still, even after one settles on a particular iterative algorithm for the loss minimization problem at hand, the choice of initial values for the parameters (also known as the initialization) can affect the performance of the algorithm. To understand this phenomenon, let us consider the scenario shown in Figure 1.5. The points on the horizontal axis represent possible values of the parameter, and the curve represents the loss function L. One can think of the curve representing L as a mountain range, and an iterative algorithm as a hiker without a map who can only explore her immediate surroundings to decide which way to go. The goal of loss minimization can then be understood as finding the lowest point on the mountain range. In Figure 1.5, this is the black dot corresponding to the parameter value w* and loss function value L(w*).
13 Sebastian Ruder, ‘An overview of gradient descent optimization algorithms’ (Sebastian Ruder, 19 January 2016) <https://arxiv.org/abs/1609.04747>.
Now, suppose that the hiker starts at the leftmost black dot on the mountain range. This corresponds to initializing the algorithm at the point w′, whose loss function value is L(w′). To get to a lower point on the mountain range, she will naturally walk down the valley until she reaches the point w̄ with value L(w̄). At this point, the hiker cannot reach a lower point on the mountain range without first going up. Since she does not have a full picture of the mountain range, she will be inclined to stop there. This is precisely the behaviour of most iterative algorithms: they will stop at a point when there is no other point with a lower loss function value nearby. However, the point with value L(w̄) is clearly not the lowest one on the mountain range. In other words, by starting at the leftmost black dot, most iterative algorithms will stop at the sub-optimal point corresponding to the value L(w̄). On the other hand, if the algorithm starts at the rightmost black dot, which corresponds to taking w″ as the initial point with loss function value L(w″), then it will stop at the lowest point on the mountain range. The parameter value at this point is w*, which corresponds to the prediction rule that best fits the training data.
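The hiker analogy can be simulated. The sketch below runs plain gradient descent on an invented one-dimensional loss with two valleys: started on the left, it stops in the shallower valley (the analogue of L(w̄)); started on the right, it finds the lower one (the analogue of L(w*)). The loss function and step sizes are made up for illustration.

def L(w):
    # An invented non-convex loss with two valleys, standing in for
    # the 'mountain range' of Figure 1.5.
    return 0.05 * w**4 - 0.5 * w**2 - 0.1 * w

def L_grad(w):
    # Derivative of L, used to decide which way is 'downhill'.
    return 0.2 * w**3 - w - 0.1

def descend(w, step_size=0.01, steps=10000):
    # Walk downhill until (effectively) no further local progress is made.
    for _ in range(steps):
        w -= step_size * L_grad(w)
    return w

w_left, w_right = descend(-3.0), descend(3.0)
print(w_left, L(w_left))    # stuck in the shallower left valley
print(w_right, L(w_right))  # reaches the lower right valley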
In view of the above, a natural question is how to find a good initialization for the learning task at hand. Although there are some general rules of thumb for choosing the initialization, finding a good one is very much an art and requires substantial human input and experience.14 Moreover, since the shape of the loss
14 To quote Goodfellow, Bengio, and Courville, Deep Learning (n 11) 293, ‘Modern initialization strategies are simple and heuristic. Designing improved initialization strategies is a difficult task because neural network optimization is not yet well understood.’
Figure 1.5 Effect of initialization
function depends on both the training data and the type of prediction rules used, an initialization that works well for one setting may not work well for another.
From the brief introduction to supervised learning above, it can be seen that the performance of the prediction rule obtained from a supervised learning process hinges upon three human-dependent factors: the quality of the training data (in particular, the informativeness of the labels); the type of prediction rules used to fit the training data (eg, the choice of a certain neural network architecture); and the algorithm (including its settings, such as the initialization and the rule for finding the next point) used to solve the loss minimization problem associated with the learning task. As such, the prediction rules obtained by two different users will generally differ if they specify any of the above factors differently. Put another way, it is generally difficult to reproduce the outcome of a supervised learning process without knowing how each of the above three factors is specified. In addition, the prediction rule obtained is often neither transparent nor interpretable. Indeed, a human cannot easily explain how an iterative algorithm combines the different features of the data to produce the prediction rule, how the rule makes predictions, or why the rule makes a certain prediction for a given data sample. This black-box nature of the prediction rule limits our understanding of the learning task at hand and can have various undesirable consequences.15
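The reproducibility point can be made concrete with the stochastic gradient sketch from earlier: if the training data, the rule type, and every algorithmic setting (including the random seed that drives the sampling) are pinned down, two runs produce the same rule; change any one of them and the learned parameters generally differ. The helper below is hypothetical and simply repackages the earlier toy example.

import numpy as np

def train(seed):
    # Rerun the earlier stochastic gradient sketch with a fixed seed,
    # so that all three human-dependent factors are held constant.
    rng = np.random.default_rng(seed)
    inputs = np.array([0.0, 1.0, 2.0, 3.0])
    labels = np.array([1.1, 1.9, 3.2, 3.8])
    a, b = 0.0, 0.0
    for step in range(1, 201):
        i = rng.integers(len(inputs))
        error = (a * inputs[i] + b) - labels[i]
        step_size = 0.1 / step
        a -= step_size * error * inputs[i]
        b -= step_size * error
    return a, b

print(train(seed=0) == train(seed=0))  # True: every factor is fixed
print(train(seed=0) == train(seed=1))  # False: one algorithmic setting changed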
2.2 Unsupervised Learning
Unlike supervised learning, in which the goal is to learn from the labels of the training data samples a rule that can predict the labels of the unseen test data samples as accurately as possible, unsupervised learning is concerned with the scenario where the training data samples do not have any labels. The goal, loosely speaking, is to uncover hidden structure in the data. Such a goal is based on the belief that data generated by physical processes are not random but rather contain information about the processes themselves.16 For instance, a picture taken by a camera typically contains a foreground and a background, and one can try to identify the backgrounds in image data for further processing. However, in an unsupervised learning task, there is no external guidance on whether the uncovered structure is correct or not, hence the word ‘unsupervised’. This is in contrast to supervised learning, where one can evaluate the accuracy of the prediction rule by comparing the predicted labels and actual labels of the data samples. Thus, one may
15 In recent years, there has been growing interest in interpretable ML, which concerns the design of ML systems whose outputs can be explained; see Christoph Molnar, Interpretable Machine Learning: A Guide for Making Black Box Models Explainable (1st edn, Lulu 2019) for some recent advances in this direction (hereafter Molnar, Interpretable Machine Learning).
16 DeLiang Wang, ‘Unsupervised Learning: Foundations of Neural Computation—A Review’ (2001) 22(2) AI Magazine 101.