The platform shift to AI is well underway. And while it holds the promise of transforming work and giving organizations a competitive advantage, realizing those benefits isn’t possible without a culture that embraces curiosity, failure, and learning. Leaders are uniquely positioned to foster this culture within their organizations today in order to set their teams up for success in the future. When paired with the capabilities of AI, this kind of culture will unlock a better future of work for everyone.
As business leaders, today we find ourselves in a place that’s all too familiar: the unfamiliar. Just as we steered our teams through the shift to remote and flexible work, we’re now on the verge of another seismic shift: AI. And like the shift to flexible work, priming an organization to embrace AI will hinge first and foremost on culture.
The pace and volume of work has increased exponentially, and we’re all struggling under the weight of it. Leaders and employees are eager for AI to lift the burden. That’s the key takeaway from our 2023 Work Trend Index, which surveyed 31,000 people across 31 countries and analyzed trillions of aggregated productivity signals in Microsoft 365, along with labor market trends on LinkedIn.
Nearly two-thirds of employees surveyed told us they don’t have enough time or energy to do their job. The cause of this drain is something we identified in the report as digital debt: the influx of data, emails, and chats has outpaced our ability to keep up. Employees today spend nearly 60% of their time communicating, leaving only 40% of their time for creating and innovating. In a world where creativity is the new productivity, digital debt isn’t just an inconvenience — it’s a liability.
AI promises to address that liability by allowing employees to focus on the most meaningful work. Increasing productivity, streamlining repetitive tasks, and increasing employee well-being are the top three things leaders want from AI, according to our research. Notably, amid fears that AI will replace jobs, reducing headcount was last on the list.
Becoming an AI-powered organization will require us to work in entirely new ways. As leaders, there are three steps we can take today to get our cultures ready for an AI-powered future:
Choose curiosity over fear
AI marks a new interaction model between humans and computers. Until now, the way we’ve interacted with computers has been similar to how we interact with a calculator: We ask a question or give directions, and the computer provides an answer. But with AI, the computer will be more like a copilot. We’ll need to develop a new kind of chemistry together, learning when and how to ask questions and about the importance of fact-checking responses.
Fear is a natural reaction to change, so it’s understandable for employees to feel some uncertainty about what AI will mean for their work. Our research found that while 49% of employees are concerned AI will replace their jobs, the promise of AI outweighs the threat: 70% of employees are more than willing to delegate to AI to lighten their workloads.
We’re rarely served by operating from a place of fear. By fostering a culture of curiosity, we can empower our people to understand how AI works, including its capabilities and its shortcomings. This understanding starts with firsthand experience. Encourage employees to put curiosity into action by experimenting (safely and securely) with new AI tools, such as AI-powered search, intelligent writing assistance, or smart calendaring, to name just a few. Since every role and function will have different ways to use and benefit from AI, challenge them to rethink how AI could improve or transform processes as they get familiar with the tools. From there, employees can begin to unlock new ways of working.
AI will change nearly every job, and nearly every work pattern can benefit from some degree of AI augmentation or automation. As leaders, now is the time to encourage our teams to bring creativity to reimagining work, adopting a test-and-learn strategy to find ways AI can best help meet the needs of the business.
AI won’t get it right every time, but even when it’s wrong, it’s usefully wrong. It moves you at least one step forward from a blank slate, so you can jump right into the critical thinking work of reviewing, editing, or augmenting. It will take time to learn these new patterns of work and identify which processes need to change and how. But if we create a culture where experimentation and learning are viewed as a prerequisite to progress, we’ll get there much faster.
As leaders, we have a responsibility to create the right environment for failure so that our people are empowered to experiment to uncover how AI can fit into their workflows. In my experience, that includes celebrating wins as well as sharing lessons learned in order to help keep each other from wasting time learning the same lesson twice. Both formally and informally, carve out space for people to share knowledge — for example, by crowdsourcing a prompt guidebook within your department or making AI tips a standing agenda item in your monthly all-staff meetings. Operating with agility will be a foundational tenet of AI-powered organizations.
Become a learn-it-all
I often hear concerns that AI will be a crutch, offering shortcuts and workarounds that ultimately diminish innovation and engagement. In my mind, the potential for AI is so much bigger than that, and it will become a competitive advantage for those who use it thoughtfully. Those will become your most engaged and innovative employees.
The value you get from AI is only as good as what you put in. Simple questions will result in simple answers. But sophisticated, thought-provoking questions will result in more complex analysis and bigger ideas. The value will shift from employees who have all the right answers to employees who know how to ask the right questions. Organizations of the future will place a premium on analytical thinkers and problem-solvers who can effectively reason over AI-generated content.
At Microsoft, we believe a learn-it-all mentality will get us much farther than a know-it-all one. And while the learning curve of using AI can be daunting, it’s a muscle that has to be built over time — and that we should start strengthening today. When I talk to leaders about how to achieve this across their companies and teams, I tell them three things:
- Establish guardrails to help people experiment safely and responsibly. Which tools do you encourage employees to use, and what data is — and isn’t — appropriate to input. What guidelines do they need to follow around fact-checking, reviewing, and editing?
- Learning to work with AI will need to be a continuous process, not a one-time training. Infuse learning opportunities into your rhythm of business and keep employees up to date with the latest resources. For example, one team might block off Friday afternoons for learning, while another has monthly “office hours” for AI Q&A and troubleshooting. And think beyond traditional courses or resources. How can peer-to-peer knowledge sharing, such as lunch and learns or a digital hotline, play a role so people can learn from each other?
- Embrace the need for change management. Being intentional and programmatic will be crucial for successfully adopting AI. Identify goals and metrics for success, and select AI champions or pilot program leads to help bring the vision to life. Different functions and disciplines will have different needs and challenges when it comes to AI, but one shared need will be for structure and support as we all transition to a new way of working.
The platform shift to AI is well underway. And while it holds the promise of transforming work and giving organizations a competitive advantage, realizing those benefits isn’t possible without a culture that embraces curiosity, failure, and learning. As leaders, we’re uniquely positioned to foster this culture within our organizations today in order to set our teams up for success in the future. When paired with the capabilities of AI, this kind of culture will unlock a better future of work for everyone.
How to Train Generative AI Using Your Company’s Data
Many companies are experimenting with ChatGPT and other large language or image models. They have generally found them to be astounding in terms of their ability to express complex ideas in articulate language. However, most users realize that these systems are primarily trained on internet-based information and can’t respond to prompts or questions regarding proprietary content or knowledge.
Leveraging a company’s propriety knowledge is critical to its ability to compete and innovate, especially in today’s volatile environment. Organizational Innovation is fueled through effective and agile creation, management, application, recombination, and deployment of knowledge assets and know-how. However, knowledge within organizations is typically generated and captured across various sources and forms, including individual minds, processes, policies, reports, operational transactions, discussion boards, and online chats and meetings. As such, a company’s comprehensive knowledge is often unaccounted for and difficult to organize and deploy where needed in an effective or efficient way.
Emerging technologies in the form of large language and image generative AI models offer new opportunities for knowledge management, thereby enhancing company performance, learning, and innovation capabilities. For example, in a study conducted in a Fortune 500 provider of business process software, a generative AI-based system for customer support led to increased productivity of customer support agents and improved retention, while leading to higher positive feedback on the part of customers. The system also expedited the learning and skill development of novice agents.
Like that company, a growing number of organizations are attempting to leverage the language processing skills and general reasoning abilities of large language models (LLMs) to capture and provide broad internal (or customer) access to their own intellectual capital. They are using it for such purposes as informing their customer-facing employees on company policy and product/service recommendations, solving customer service problems, or capturing employees’ knowledge before they depart the organization.
These objectives were also present during the heyday of the “knowledge management” movement in the 1990s and early 2000s, but most companies found the technology of the time inadequate for the task. Today, however, generative AI is rekindling the possibility of capturing and disseminating important knowledge throughout an organization and beyond its walls. As one manager using generative AI for this purpose put it, “I feel like a jetpack just came into my life.” Despite current advances, some of the same factors that made knowledge management difficult in the past are still present.
The Technology for Generative AI-Based Knowledge Management
The technology to incorporate an organization’s specific domain knowledge into an LLM is evolving rapidly. At the moment there are three primary approaches to incorporating proprietary content into a generative model.
Training an LLM from Scratch
One approach is to create and train one’s own domain-specific model from scratch. That’s not a common approach, since it requires a massive amount of high-quality data to train a large language model, and most companies simply don’t have it. It also requires access to considerable computing power and well-trained data science talent.
One company that has employed this approach is Bloomberg, which recently announced that it had created BloombergGPT for finance-specific content and a natural-language interface with its data terminal. Bloomberg has over 40 years’ worth of financial data, news, and documents, which it combined with a large volume of text from financial filings and internet data. In total, Bloomberg’s data scientists employed 700 tokens, or about 350 billion words, 50 billion parameters, and 1.3 million hours of graphics processing unit time. Few companies have those resources available.
Fine-Tuning an Existing LLM
A second approach is to “fine-tune” train an existing LLM to add specific domain content to a system that is already trained on general knowledge and language-based interaction. This approach involves adjusting some parameters of a base model, and typically requires substantially less data — usually only hundreds or thousands of documents, rather than millions or billions — and less computing time than creating a new model from scratch.
Google, for example, used fine-tune training on its Med-PaLM2 (second version) model for medical knowledge. The research project started with Google’s general PaLM2 LLM and retrained it on carefully curated medical knowledge from a variety of public medical datasets. The model was able to answer 85% of U.S. medical licensing exam questions — almost 20% better than the first version of the system. Despite this rapid progress, when tested on such criteria as scientific factuality, precision, medical consensus, reasoning, bias and harm, and evaluated by human experts from multiple countries, the development team felt that the system still needed substantial improvement before being adopted for clinical practice.
The fine-tuning approach has some constraints, however. Although requiring much less computing power and time than training an LLM, it can still be expensive to train, which was not a problem for Google but would be for many other companies. It requires considerable data science expertise; the scientific paper for the Google project, for example, had 31 co-authors. Some data scientists argue that it is best suited not to adding new content, but rather to adding new content formats and styles (such as chat or writing like William Shakespeare). Additionally, some LLM vendors (for example, OpenAI) do not allow fine-tuning on their latest LLMs, such as GPT-4.
Prompt-tuning an Existing LLM
Perhaps the most common approach to customizing the content of an LLM for non-cloud vendor companies is to tune it through prompts. With this approach, the original model is kept frozen, and is modified through prompts in the context window that contain domain-specific knowledge. After prompt tuning, the model can answer questions related to that knowledge. This approach is the most computationally efficient of the three, and it does not require a vast amount of data to be trained on a new content domain.
Morgan Stanley, for example, used prompt tuning to train OpenAI’s GPT-4 model using a carefully curated set of 100,000 documents with important investing, general business, and investment process knowledge. The goal was to provide the company’s financial advisors with accurate and easily accessible knowledge on key issues they encounter in their roles advising clients. The prompt-trained system is operated in a private cloud that is only accessible to Morgan Stanley employees.
While this is perhaps the easiest of the three approaches for an organization to adopt, it is not without technical challenges. When using unstructured data like text as input to an LLM, the data is likely to be too large with too many important attributes to enter it directly in the context window for the LLM. The alternative is to create vector embeddings — arrays of numeric values produced from the text by another pre-trained machine learning model (Morgan Stanley uses one from OpenAI called Ada). The vector embeddings are a more compact representation of this data which preserves contextual relationships in the text. When a user enters a prompt into the system, a similarity algorithm determines which vectors should be submitted to the GPT-4 model. Although several vendors are offering tools to make this process of prompt tuning easier, it is still complex enough that most companies adopting the approach would need to have substantial data science talent.
However, this approach does not need to be very time-consuming or expensive if the needed content is already present. The investment research company Morningstar, for example, used prompt tuning and vector embeddings for its Mo research tool built on generative AI. It incorporates more than 10,000 pieces of Morningstar research. After only a month or so of work on its system, Morningstar opened Mo usage to their financial advisors and independent investor customers. It even attached Mo to a digital avatar that could speak out its answers. This technical approach is not expensive; in its first month in use, Mo answered 25,000 questions at an average cost of $.002 per question for a total cost of $3,000.
Content Curation and Governance
As with traditional knowledge management in which documents were loaded into discussion databases like Microsoft Sharepoint, with generative AI, content needs to be high-quality before customizing LLMs in any fashion. In some cases, as with the Google Med-PaLM2 system, there are widely available databases of medical knowledge that have already been curated. Otherwise, a company needs to rely on human curation to ensure that knowledge content is accurate, timely, and not duplicated. Morgan Stanley, for example, has a group of 20 or so knowledge managers in the Philippines who are constantly scoring documents along multiple criteria; these determine the suitability for incorporation into the GPT-4 system. Most companies that do not have well-curated content will find it challenging to do so for just this purpose.
Morgan Stanley has also found that it is much easier to maintain high quality knowledge if content authors are aware of how to create effective documents. They are required to take two courses, one on the document management tool, and a second on how to write and tag these documents. This is a component of the company’s approach to content governance approach — a systematic method for capturing and managing important digital content.
At Morningstar, content creators are being taught what type of content works well with the Mo system and what does not. They submit their content into a content management system and it goes directly into the vector database that supplies the OpenAI model.
Quality Assurance and Evaluation
An important aspect of managing generative AI content is ensuring quality. Generative AI is widely known to “hallucinate” on occasion, confidently stating facts that are incorrect or nonexistent. Errors of this type can be problematic for businesses but could be deadly in healthcare applications. The good news is that companies who have tuned their LLMs on domain-specific information have found that hallucinations are less of a problem than out-of-the-box LLMs, at least if there are no extended dialogues or non-business prompts.
Companies adopting these approaches to generative AI knowledge management should develop an evaluation strategy. For example, for BloombergGPT, which is intended for answering financial and investing questions, the system was evaluated on public dataset financial tasks, named entity recognition, sentiment analysis ability, and a set of reasoning and general natural language processing tasks. The Google Med-PaLM2 system, eventually oriented to answering patient and physician medical questions, had a much more extensive evaluation strategy, reflecting the criticality of accuracy and safety in the medical domain.
Life or death isn’t an issue at Morgan Stanley, but producing highly accurate responses to financial and investing questions is important to the firm, its clients, and its regulators. The answers provided by the system were carefully evaluated by human reviewers before it was released to any users. Then it was piloted for several months by 300 financial advisors. As its primary approach to ongoing evaluation, Morgan Stanley has a set of 400 “golden questions” to which the correct answers are known. Every time any change is made to the system, employees test it with the golden questions to see if there has been any “regression,” or less accurate answers.
Legal and Governance Issues
Legal and governance issues associated with LLM deployments are complex and evolving, leading to risk factors involving intellectual property, data privacy and security, bias and ethics, and false/inaccurate output. Currently, the legal status of LLM outputs is still unclear. Since LLMs don’t produce exact replicas of any of the text used to train the model, many legal observers feel that “fair use” provisions of copyright law will apply to them, although this hasn’t been tested in the courts (and not all countries have such provisions in their copyright laws). In any case, it is a good idea for any company making extensive use of generative AI for managing knowledge (or most other purposes for that matter) to have legal representatives involved in the creation and governance process for tuned LLMs. At Morningstar, for example, the company’s attorneys helped create a series of “pre-prompts” that tell the generative AI system what types of questions it should answer and those it should politely avoid.
User prompts into publicly-available LLMs are used to train future versions of the system, so some companies (Samsung, for example) have feared propagation of confidential and private information and banned LLM use by employees. However, most companies’ efforts to tune LLMs with domain-specific content are performed on private instances of the models that are not accessible to public users, so this should not be a problem. In addition, some generative AI systems such as ChatGPT allow users to turn off the collection of chat histories, which can address confidentiality issues even on public systems.
In order to address confidentiality and privacy concerns, some vendors are providing advanced and improved safety and security features for LLMs including erasing user prompts, restricting certain topics, and preventing source code and propriety data inputs into publicly accessible LLMs. Furthermore, vendors of enterprise software systems are incorporating a “Trust Layer” in their products and services. Salesforce, for example, incorporated its Einstein GPT feature into its AI Cloud suite to address the “AI Trust Gap” between companies who desire to quickly deploy LLM capabilities and the aforementioned risks that these systems pose in business environments.
Shaping User Behavior
Ease of use, broad public availability, and useful answers that span various knowledge domains have led to rapid and somewhat unguided and organic adoption of generative AI-based knowledge management by employees. For example, a recent survey indicated that more than a third of surveyed employees used generative AI in their jobs, but 68% of respondents didn’t inform their supervisors that they were using the tool. To realize opportunities and manage potential risks of generative AI applications to knowledge management, companies need to develop a culture of transparency and accountability that would make generative AI-based knowledge management systems successful.
In addition to implementation of policies and guidelines, users need to understand how to safely and effectively incorporate generative AI capabilities into their tasks to enhance performance and productivity. Generative AI capabilities, including awareness of context and history, generating new content by aggregating or combining knowledge from various sources, and data-driven predictions, can provide powerful support for knowledge work. Generative AI-based knowledge management systems can automate information-intensive search processes (legal case research, for example) as well as high-volume and low-complexity cognitive tasks such as answering routine customer emails. This approach increases efficiency of employees, freeing them to put more effort into the complex decision-making and problem-solving aspects of their jobs.
Some specific behaviors that might be desirable to inculcate — either though training or policies — include:
- Knowledge of what types of content are available through the system;
- How to create effective prompts;
- What types of prompts and dialogues are allowed, and which ones are not;
- How to request additional knowledge content to be added to the system;
- How to use the system’s responses in dealing with customers and partners;
- How to create new content in a useful and effective manner.
Both Morgan Stanley and Morningstar trained content creators in particular on how best to create and tag content, and what types of content are well-suited to generative AI usage.
“Everything Is Moving Very Fast”
One of the executives we interviewed said, “I can tell you what things are like today. But everything is moving very fast in this area.” New LLMs and new approaches to tuning their content are announced daily, as are new products from vendors with specific content or task foci. Any company that commits to embedding its own knowledge into a generative AI system should be prepared to revise its approach to the issue frequently over the next several years.
While there are many challenging issues involved in building and using generative AI systems trained on a company’s own knowledge content, we’re confident that the overall benefit to the company is worth the effort to address these challenges. The long-term vision of enabling any employee — and customers as well — to easily access important knowledge within and outside of a company to enhance productivity and innovation is a powerful draw. Generative AI appears to be the technology that is finally making it possible.
13 Principles for Using AI Responsibly
The competitive nature of AI development poses a dilemma for organizations, as prioritizing speed may lead to neglecting ethical guidelines, bias detection, and safety measures. Known and emerging concerns associated with AI in the workplace include the spread of misinformation, copyright and intellectual property concerns, cybersecurity, data privacy, as well as navigating rapid and ambiguous regulations. To mitigate these risks, we propose thirteen principles for responsible AI at work.
Love it or loath it, the rapid expansion of AI will not slow down anytime soon. But AI blunders can quickly damage a brand’s reputation — just ask Microsoft’s first chatbot, Tay. In the tech race, all leaders fear being left behind if they slow down while others don’t. It’s a high-stakes situation where cooperation seems risky, and defection tempting. This “prisoner’s dilemma” (as it’s called in game theory) poses risks to responsible AI practices. Leaders, prioritizing speed to market, are driving the current AI arms race in which major corporate players are rushing products and potentially short-changing critical considerations like ethical guidelines, bias detection, and safety measures. For instance, major tech corporations are laying off their AI ethics teams precisely at a time when responsible actions are needed most.
It’s also important to recognize that the AI arms race extends beyond the developers of large language models (LLMs) such as OpenAI, Google, and Meta. It encompasses many companies utilizing LLMs to support their own custom applications. In the world of professional services, for example, PwC announced it is deploying AI chatbots for 4,000 of their lawyers, distributed across 100 countries. These AI-powered assistants will “help lawyers with contract analysis, regulatory compliance work, due diligence, and other legal advisory and consulting services.” PwC’s management is also considering expanding these AI chatbots into their tax practice. In total, the consulting giant plans to pour $1 billion into “generative AI” — a powerful new tool capable of delivering game-changing boosts to performance.
In a similar vein, KPMG launched its own AI-powered assistant, dubbed KymChat, which will help employees rapidly find internal experts across the entire organization, wrap them around incoming opportunities, and automatically generate proposals based on the match between project requirements and available talent. Their AI assistant “will better enable cross-team collaboration and help those new to the firm with a more seamless and efficient people-navigation experience.”
Slack is also incorporating generative AI into the development of Slack GPT, an AI assistant designed to help employees work smarter not harder. The platform incorporates a range of AI capabilities, such as conversation summaries and writing assistance, to enhance user productivity.
These examples are just the tip of the iceberg. Soon hundreds of millions of Microsoft 365 users will have access to Business Chat, an agent that joins the user in their work, striving to make sense of their Microsoft 365 data. Employees can prompt the assistant to do everything from developing status report summaries based on meeting transcripts and email communication to identifying flaws in strategy and coming up with solutions.
This rapid deployment of AI agents is why Arvind Krishna, CEO of IBM, recently wrote that, “[p]eople working together with trusted A.I. will have a transformative effect on our economy and society … It’s time we embrace that partnership — and prepare our workforces for everything A.I. has to offer.” Simply put, organizations are experiencing exponential growth in the installation of AI-powered tools and firms that don’t adapt risk getting left behind.
AI Risks at Work
Unfortunately, remaining competitive also introduces significant risk for both employees and employers. For example, a 2022 UNESCO publication on “the effects of AI on the working lives of women” reports that AI in the recruitment process, for example, is excluding women from upward moves. One study the report cites that included 21 experiments consisting of over 60,000 targeted job advertisements found that “setting the user’s gender to ‘Female’ resulted in fewer instances of ads related to high-paying jobs than for users selecting ‘Male’ as their gender.” And even though this AI bias in recruitment and hiring is well-known, it’s not going away anytime soon. As the UNESCO report goes on to say, “A 2021 study showed evidence of job advertisements skewed by gender on Facebook even when the advertisers wanted a gender-balanced audience.” It’s often a matter of biased data which will continue to infect AI tools and threaten key workforce factors such as diversity, equity, and inclusion.
Discriminatory employment practices may be only one of a cocktail of legal risks that generative AI exposes organizations to. For example, OpenAI is facing its first defamation lawsuit as a result of allegations that ChatGPT produced harmful misinformation. Specifically, the system produced a summary of a real court case which included fabricated accusations of embezzlement against a radio host in Georgia. This highlights the negative impact on organizations for creating and sharing AI generated information. It underscores concerns about LLMs fabricating false and libelous content, resulting in reputational damage, loss of credibility, diminished customer trust, and serious legal repercussions.
In addition to concerns related to libel, there are risks associated with copyright and intellectual property infringements. Several high-profile legal cases have emerged where the developers of generative AI tools have been sued for the alleged improper use of licensed content. The presence of copyright and intellectual property infringements, coupled with the legal implications of such violations, poses significant risks for organizations utilizing generative AI products. Organizations can improperly use licensed content through generative AI by unknowingly engaging in activities such as plagiarism, unauthorized adaptations, commercial use without licensing, and misusing Creative Commons or open-source content, exposing themselves to potential legal consequences.
The large-scale deployment of AI also magnifies the risks of cyberattacks. The fear amongst cybersecurity experts is that generative AI could be used to identify and exploit vulnerabilities within business information systems, given the ability of LLMs to automate coding and bug detection, which could be used by malicious actors to break through security barriers. There’s also the fear of employees accidentally sharing sensitive data with third-party AI providers. A notable instance involves Samsung staff unintentionally leaking trade secrets through ChatGPT while using the LLM to review source code. Due to their failure to opt out of data sharing, confidential information was inadvertently provided to OpenAI. And even though Samsung and others are taking steps to restrict the use of third-party AI tools on company-owned devices, there’s still the concern that employees can leak information through the use of such systems on personal devices.
On top of these risks, businesses will soon have to navigate nascent, varied, and somewhat murky regulations. Anyone hiring in New York City, for instance, will have to ensure their AI-powered recruitment and hiring tech doesn’t violate the City’s “automated employment decision tool” law. To comply with the new law, employers will need to take various steps such as conducting third-party bias audits of their hiring tools and publicly disclosing the findings. AI regulation is also scaling up nationally with the Biden-Harris administration’s “Blueprint for an AI Bill of Rights” and internationally with the EU’s AI Act, which will mark a new era of regulation for employers.
This growing nebulous of evolving regulations and pitfalls is why thought leaders such as Gartner are strongly suggesting that businesses “proceed but don’t over pivot” and that they “create a task force reporting to the CIO and CEO” to plan a roadmap for a safe AI transformation that mitigates various legal, reputational, and workforce risks. Leaders dealing with this AI dilemma have important decision to make. On the one hand, there is a pressing competitive pressure to fully embrace AI. However, on the other hand, a growing concern is arising as the implementation of irresponsible AI can result in severe penalties, substantial damage to reputation, and significant operational setbacks. The concern is that in their quest to stay ahead, leaders may unknowingly introduce potential time bombs into their organization, which are poised to cause major problems once AI solutions are deployed and regulations take effect.
For example, the National Eating Disorder Association (NEDA) recently announced it was letting go of its hotline staff and replacing them with their new chatbot, Tessa. However, just days before making the transition, NEDA discovered that their system was promoting harmful advice such as encouraging people with eating disorders to restrict their calories and to lose one to two pounds per week. The World Bank spent $1 billion to develop and deploy an algorithmic system, called Takaful, to distribute financial assistance that Human Rights Watch now says ironically creates inequity. And two lawyers from New York are facing possible disciplinary action after using ChatGPT to draft a court filing that was found to have several references to previous cases that did not exist. These instances highlight the need for well-trained and well-supported employees at the center of this digital transformation. While AI can serve as a valuable assistant, it should not assume the leading position.
Principles for Responsible AI at Work
To help decision-makers avoid negative outcomes while also remaining competitive in the age of AI, we’ve devised several principles for a sustainable AI-powered workforce. The principles are a blend of ethical frameworks from institutions like the National Science Foundation as well as legal requirements related to employee monitoring and data privacy such as the Electronic Communications Privacy Act and the California Privacy Rights Act. The steps for ensuring responsible AI at work include:
- Informed Consent. Obtain voluntary and informed agreement from employees to participate in any AI-powered intervention after the employees are provided with all the relevant information about the initiative. This includes the program’s purpose, procedures, and potential risks and benefits.
- Aligned Interests. The goals, risks, and benefits for both the employer and employee are clearly articulated and aligned.
- Opt In & Easy Exits. Employees must opt into AI-powered programs without feeling forced or coerced, and they can easily withdraw from the program at any time without any negative consequences and without explanation.
- Conversational Transparency. When AI-based conversational agents are used, the agent should formally reveal any persuasive objectives the system aims to achieve through the dialogue with the employee.
- Debiased and Explainable AI. Explicitly outline the steps taken to remove, minimize, and mitigate bias in AI-powered employee interventions—especially for disadvantaged and vulnerable groups—and provide transparent explanations into how AI systems arrive at their decisions and actions.
- AI Training and Development. Provide continuous employee training and development to ensure the safe and responsible use of AI-powered tools.
- Health and Well-Being. Identify types of AI-induced stress, discomfort, or harm and articulate steps to minimize risks (e.g., how will the employer minimize stress caused by constant AI-powered monitoring of employee behavior).
- Data Collection. Identify what data will be collected, if data collection involves any invasive or intrusive procedures (e.g., the use of webcams in work-from-home situations), and what steps will be taken to minimize risk.
- Data. Disclose any intention to share personal data, with whom, and why.
- Privacy and Security. Articulate protocols for maintaining privacy, storing employee data securely, and what steps will be taken in the event of a privacy breach.
- Third Party Disclosure. Disclose all third parties used to provide and maintain AI assets, what the third party’s role is, and how the third party will ensure employee privacy.
- Communication. Inform employees about changes in data collection, data management, or data sharing as well as any changes in AI assets or third-party relationships.
- Laws and Regulations. Express ongoing commitment to comply with all laws and regulations related to employee data and the use of AI.
We encourage leaders to urgently adopt and develop this checklist in their organizations. By applying such principles, leaders can ensure rapid and responsible AI deployment.
Managing the Risks of Generative AI
Corporate leaders, academics, policymakers, and countless others are looking for ways to harness generative AI technology, which has the potential to transform the way we learn, work, and more. In business, generative AI has the potential to transform the way companies interact with customers and drive business growth. New research shows 67% of senior IT leaders are prioritizing generative AI for their business within the next 18 months, with one-third (33%) naming it as a top priority. Companies are exploring how it could impact every part of the business, including sales, customer service, marketing, commerce, IT, legal, HR, and others.
However, senior IT leaders need a trusted, data-secure way for their employees to use these technologies. Seventy-nine-percent of senior IT leaders reported concerns that these technologies bring the potential for security risks, and another 73% are concerned about biased outcomes. More broadly, organizations must recognize the need to ensure the ethical, transparent, and responsible use of these technologies.
A business using generative AI technology in an enterprise setting is different from consumers using it for private, individual use. Businesses need to adhere to regulations relevant to their respective industries (think: healthcare), and there’s a minefield of legal, financial, and ethical implications if the content generated is inaccurate, inaccessible, or offensive. For example, the risk of harm when an generative AI chatbot gives incorrect steps for cooking a recipe is much lower than when giving a field service worker instructions for repairing a piece of heavy machinery. If not designed and deployed with clear ethical guidelines, generative AI can have unintended consequences and potentially cause real harm.
Organizations need a clear and actionable framework for how to use generative AI and to align their generative AI goals with their businesses’ “jobs to be done,” including how generative AI will impact sales, marketing, commerce, service, and IT jobs.
In 2019, we published our trusted AI principles (transparency, fairness, responsibility, accountability, and reliability), meant to guide the development of ethical AI tools. These can apply to any organization investing in AI. But these principles only go so far if organizations lack an ethical AI practice to operationalize them into the development and adoption of AI technology. A mature ethical AI practice operationalizes its principles or values through responsible product development and deployment — uniting disciplines such as product management, data science, engineering, privacy, legal, user research, design, and accessibility — to mitigate the potential harms and maximize the social benefits of AI. There are models for how organizations can start, mature, and expand these practices, which provide clear roadmaps for how to build the infrastructure for ethical AI development.
But with the mainstream emergence — and accessibility — of generative AI, we recognized that organizations needed guidelines specific to the risks this specific technology presents. These guidelines don’t replace our principles, but instead act as a North Star for how they can be operationalized and put into practice as businesses develop products and services that use this new technology.
Guidelines for the ethical development of generative AI
Our new set of guidelines can help organizations evaluate generative AI’s risks and considerations as these tools gain mainstream adoption. They cover five focus areas.
Organizations need to be able to train AI models on their own data to deliver verifiable results that balance accuracy, precision, and recall (the model’s ability to correctly identify positive cases within a given dataset). It’s important to communicate when there is uncertainty regarding generative AI responses and enable people to validate them. This can be done by citing the sources where the model is pulling information from in order to create content, explaining why the AI gave the response it did, highlighting uncertainty, and creating guardrails preventing some tasks from being fully automated.
Making every effort to mitigate bias, toxicity, and harmful outputs by conducting bias, explainability, and robustness assessments is always a priority in AI. Organizations must protect the privacy of any personally identifying information present in the data used for training to prevent potential harm. Further, security assessments can help organizations identify vulnerabilities that may be exploited by bad actors (e.g., “do anything now” prompt injection attacks that have been used to override ChatGPT’s guardrails).
When collecting data to train and evaluate our models, respect data provenance and ensure there is consent to use that data. This can be done by leveraging open-source and user-provided data. And, when autonomously delivering outputs, it’s a necessity to be transparent that an AI has created the content. This can be done through watermarks on the content or through in-app messaging.
While there are some cases where it is best to fully automate processes, AI should more often play a supporting role. Today, generative AI is a great assistant. In industries where building trust is a top priority, such as in finance or healthcare, it’s important that humans be involved in decision-making — with the help of data-driven insights that an AI model may provide — to build trust and maintain transparency. Additionally, ensure the model’s outputs are accessible to all (e.g., generate ALT text to accompany images, text output is accessible to a screen reader). And of course, one must treat content contributors, creators, and data labelers with respect (e.g., fair wages, consent to use their work).
Language models are described as “large” based on the number of values or parameters it uses. Some of these large language models (LLMs) have hundreds of billions of parameters and use a lot of energy and water to train them. For example, GPT3 took 1.287 gigawatt hours or about as much electricity to power 120 U.S. homes for a year, and 700,000 liters of clean freshwater.
When considering AI models, larger doesn’t always mean better. As we develop our own models, we will strive to minimize the size of our models while maximizing accuracy by training on models on large amounts of high-quality CRM data. This will help reduce the carbon footprint because less computation is required, which means less energy consumption from data centers and carbon emission.
Integrating generative AI
Most organizations will integrate generative AI tools rather than build their own. Here are some tactical tips for safely integrating generative AI in business applications to drive business results:
Use zero-party or first-party data
Companies should train generative AI tools using zero-party data — data that customers share proactively — and first-party data, which they collect directly. Strong data provenance is key to ensuring models are accurate, original, and trusted. Relying on third-party data, or information obtained from external sources, to train AI tools makes it difficult to ensure that output is accurate.
For example, data brokers may have old data, incorrectly combine data from devices or accounts that don’t belong to the same person, and/or make inaccurate inferences based on the data. This applies for our customers when we are grounding the models in their data. So in Marketing Cloud, if the data in a customer’s CRM all came from data brokers, the personalization may be wrong.
Keep data fresh and well-labeled
AI is only as good as the data it’s trained on. Models that generate responses to customer support queries will produce inaccurate or out-of-date results if the content it is grounded in is old, incomplete, and inaccurate. This can lead to hallucinations, in which a tool confidently asserts that a falsehood is real. Training data that contains bias will result in tools that propagate bias.
Companies must review all datasets and documents that will be used to train models, and remove biased, toxic, and false elements. This process of curation is key to principles of safety and accuracy.
Ensure there’s a human in the loop
Just because something can be automated doesn’t mean it should be. Generative AI tools aren’t always capable of understanding emotional or business context, or knowing when they’re wrong or damaging.
Humans need to be involved to review outputs for accuracy, suss out bias, and ensure models are operating as intended. More broadly, generative AI should be seen as a way to augment human capabilities and empower communities, not replace or displace them.
Companies play a critical role in responsibly adopting generative AI, and integrating these tools in ways that enhance, not diminish, the working experience of their employees, and their customers. This comes back to ensuring the responsible use of AI in maintaining accuracy, safety, honesty, empowerment, and sustainability, mitigating risks, and eliminating biased outcomes. And, the commitment should extend beyond immediate corporate interests, encompassing broader societal responsibilities and ethical AI practices.
Test, test, test
Generative AI cannot operate on a set-it-and-forget-it basis — the tools need constant oversight. Companies can start by looking for ways to automate the review process by collecting metadata on AI systems and developing standard mitigations for specific risks.
Ultimately, humans also need to be involved in checking output for accuracy, bias and hallucinations. Companies can consider investing in ethical AI training for front-line engineers and managers so they’re prepared to assess AI tools. If resources are constrained, they can prioritize testing models that have the most potential to cause harm.
Listening to employees, trusted advisors, and impacted communities is key to identifying risks and course-correcting. Companies can create a variety of pathways for employees to report concerns, such as an anonymous hotline, a mailing list, a dedicated Slack or social media channel or focus groups. Creating incentives for employees to report issues can also be effective.
Some organizations have formed ethics advisory councils — composed of employees from across the company, external experts, or a mix of both — to weigh in on AI development. Finally, having open lines of communication with community stakeholders is key to avoiding unintended consequences.
• • •
With generative AI going mainstream, enterprises have the responsibility to ensure that they’re using this technology ethically and mitigating potential harm. By committing to guidelines and having guardrails in advance, companies can ensure that the tools they deploy are accurate, safe and trusted, and that they help humans flourish.
Generative AI is evolving quickly, so the concrete steps businesses need to take will evolve over time. But sticking to a firm ethical framework can help organizations navigate this period of rapid transformation.