The TeamPCP extortion group began advertising stolen Mistral AI source code repositories on dark web forums on 14 May. The group claims the access was a downstream consequence of the Shai-Hulud npm supply chain campaign, a coordinated attack on AI and developer tooling infrastructure that preceded, and is related to, the TanStack OIDC hijack. The advertised material reportedly includes Mistral’s internal model training code, API infrastructure components, and developer tooling repositories.
What Was Accessed
TeamPCP’s advertisement claims access to the following categories of Mistral AI assets:
- Model training repositories: Code related to the training pipelines for Mistral’s foundation models, potentially including architecture configurations and fine-tuning frameworks
- API infrastructure: Internal source code for Mistral’s API gateway and serving infrastructure
- Internal developer tooling: Evaluation frameworks, benchmark runners, and internal utilities used in Mistral’s development workflow
Mistral AI acknowledged the incident via a brief statement on 15 May, confirming that it detected unauthorised access to certain development repositories and is investigating the scope of the breach. The company stated that no user data or customer information was accessed.
The Shai-Hulud Connection
TeamPCP attributed the Mistral breach to Shai-Hulud, a coordinated supply chain campaign targeting npm packages used by AI companies and developer tool vendors. The campaign, which is related to but distinct from the TanStack OIDC hijack, compromised developer machines at multiple AI-adjacent organisations through malicious dependencies injected into AI development toolchains.
The Shai-Hulud campaign appears to be financially motivated, with TeamPCP operating as the monetisation arm: repositories obtained through developer machine compromise are being packaged and sold to the highest bidder rather than used operationally. Hudson Rock’s cyber intelligence team flagged TeamPCP advertisements appearing on four separate dark web forums.
Why AI Source Code Theft Matters
The value of proprietary AI model training code extends beyond competitive intelligence. Stolen training code can reveal:
- Architectural innovations that competitors can replicate without independent development effort
- Data pipeline details that might reveal training data provenance, including any potentially problematic data sources
- Alignment and safety techniques that, if known to adversaries, could inform attempts to fine-tune misaligned versions of similar architectures
For enterprise security professionals, the incident reinforces that AI companies are high-value targets for source code theft, and that supply chain compromise of developer tooling — rather than direct infrastructure attacks — is the current preferred vector.
Recommended Actions
- Source code inventory: Organisations handling proprietary source code should assess their exposure to npm supply chain compromise. Any developer machine that ran `npm install` against AI-related packages in the past 30 days should be reviewed.
- Repository access audit: Review git access logs for internal code repositories, looking for activity from developer machines that may have been compromised via supply chain attack. Anomalous clone or pull operations from developer machines outside normal working hours are relevant signals.
- Mistral API users: If your organisation uses the Mistral API, there is no indication from current disclosures that API keys or user data were accessed. However, monitor Mistral’s official security communications for updates.
- Third-party AI vendor risk: Review your organisation’s third-party AI vendor risk assessments. Source code breaches at AI vendors can indicate systemic security posture issues worth factoring into procurement and dependency decisions.
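As a rough illustration of the first two actions above, the Python sketch below shows one way a team might triage lockfiles and repository access events. It is a minimal sketch under stated assumptions, not a detection tool. The watchlist entries are hypothetical placeholders, not package names confirmed as compromised. The event fields (`host`, `action`, `timestamp`) are an assumed export schema rather than any hosting platform's real audit log format, and timestamps are assumed to be local-time ISO 8601 strings. The 08:00 to 19:00 working window is likewise an arbitrary default.

```python
from datetime import datetime, time

# Hypothetical watchlist: replace with package names taken from advisories
# you trust. These entries are placeholders, not known-bad packages.
WATCHLIST = {"example-ai-sdk", "@example/agent-tools"}


def flagged_packages(lockfile: dict, watchlist: set) -> list:
    """Return sorted (name, version) pairs from a parsed package-lock.json
    whose package name appears in the watchlist.

    Reads the "packages" map used by lockfileVersion 2 and 3, where keys
    look like "node_modules/<name>" and may be nested.
    """
    hits = set()
    for path, meta in lockfile.get("packages", {}).items():
        if not path:  # the empty key describes the root project itself
            continue
        # Take the name after the last "node_modules/" segment.
        name = path.split("node_modules/")[-1]
        if name in watchlist:
            hits.add((name, meta.get("version", "unknown")))
    return sorted(hits)


def off_hours_events(events: list, start: time = time(8, 0),
                     end: time = time(19, 0)) -> list:
    """Return clone/pull events that fall on a weekend or outside the
    [start, end) working window. Other actions are ignored."""
    flagged = []
    for event in events:
        if event["action"] not in {"clone", "pull"}:
            continue
        ts = datetime.fromisoformat(event["timestamp"])
        is_weekend = ts.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat/Sun
        if is_weekend or not (start <= ts.time() < end):
            flagged.append(event)
    return flagged
```

In practice the lockfile would be loaded with `json.load` from each developer machine's project directories, and the events would come from whatever audit export the repository host provides. A flagged result is a prompt for manual review, not evidence of compromise, since legitimate off-hours work and benign dependencies will also match.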