Since 2021, Microsoft's AI security team has tested more than 100 generative AI products to look for weaknesses and ethical issues. Their findings challenge some common assumptions about AI safety and highlight the continued importance of human expertise.
It turns out that the most effective attacks are not always the most sophisticated ones. “Real hackers don’t compute gradients but use rapid engineering,” notes a study cited in the Microsoft report, which compared AI security research to real-world practice. In one test, the team successfully bypassed an image generator's security features simply by hiding harmful instructions within the image's text—no complicated math required.
While Microsoft has developed PyRIT, an open source tool that can automate security testing, the team emphasizes that human judgment cannot be replaced. This became especially apparent when they tested how the chatbot handled sensitive situations, such as talking to someone who was emotionally distressed. Evaluating these scenarios requires both psychological expertise and a deep understanding of the potential mental health impacts.
The team also relied on human insight when investigating AI bias. In one example, they examined gender bias in an image generator by creating images of different occupations (without specifying gender).
New security challenges emerge
The integration of artificial intelligence into everyday applications creates new vulnerabilities. In one test, the team successfully manipulated the language model to create convincing fraud scenarios. When combined with text-to-speech technology, this creates a system that can interact with people in dangerously realistic ways.
Risks are not limited to problems specific to artificial intelligence. The team discovered a legacy security vulnerability (SSRF) in an artificial intelligence video processing tool, demonstrating that these systems face old and new security challenges.
Ongoing security needs
The study focuses specifically on “responsible AI” risks, i.e. situations where AI systems could generate harmful or ethically questionable content. These questions are particularly difficult to address because they often rely heavily on context and personal interpretation.
The Microsoft team found that unintentional exposure of ordinary users to problematic content was more concerning than intentional attacks because it suggested that security measures were not working as expected during normal use.
The findings make it clear that AI safety is not a one-time fix. Microsoft recommends continuing to find and fix vulnerabilities, followed by more testing. This would need to be backed by regulations and financial incentives, making successful attacks more expensive, they suggested.
The research team says there are still several key questions that need to be answered: How do we identify and control potentially dangerous AI capabilities, such as persuasion and deception? How do we adapt safety testing to different languages and cultures? How can companies share this in a standardized way? Their methods and results?
AI courses are suitable for people who are interested in artificial intelligence technology, including but not limited to students, engineers, data scientists, developers, and professionals in AI technology.
The course content ranges from basic to advanced. Beginners can choose basic courses and gradually go into more complex algorithms and applications.
Learning AI requires a certain mathematical foundation (such as linear algebra, probability theory, calculus, etc.), as well as programming knowledge (Python is the most commonly used programming language).
You will learn the core concepts and technologies in the fields of natural language processing, computer vision, data analysis, and master the use of AI tools and frameworks for practical development.
You can work as a data scientist, machine learning engineer, AI researcher, or apply AI technology to innovate in all walks of life.